Creating a search site isn't terribly difficult nowadays, what with the plethora of search APIs provided by the well-known engine companies. However, making your own engine is just as difficult as it has always been.
I wanted to ask the community for any insight on such an endeavor.
Using a third party API to execute an internet search would, of course, sound like the best option to get a search site going with minimal issue, but the opposite has been the case thus far.
Google seems to have dropped support for web searches via their API... which is a shame since there's no question that they are the leader in favorable search results.
I have used the Yahoo BOSS V1 API with minimal issues; however, they have discontinued support for it in favor of their new BOSS V2 API, brought about by their partnership with Bing, which has been little more than a hassle. While setting it up is easy enough, their support has been lacking: certain portions of the service go offline far too frequently, failing to live up to the claimed 99.9% up-time. On top of this, results from their API are far from matching results obtained directly from Yahoo, which calls into question why they want me to pay for a service that is obviously lacking.
I have also had experience using Bing's own API, which was less favorable due to the generally poor results from their engine. This was quite a while ago, though, and I am considering looking into them again.
All that aside, there's no question that an in-house engine would be best, as up-time and result favorability would be controlled directly and not reliant on third party support teams which are no doubt in over their heads with other support tickets from various other sites using their technology.
The first step to setting up an engine, as far as I can tell, would be to start up a crawler that would venture across the internet, saving pertinent information and accumulating a vast database that would be used later by the engine itself.
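To make that a little more concrete, here is a rough sketch of the crawl loop as I picture it, written in Python using only the standard library. Everything in it is my own assumption rather than a known-good design: the names `PageParser`, `http_fetch` and `crawl` are made up for illustration, the "database" is just an in-memory dict, and "pertinent information" is reduced to the page title and visible text.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag
from urllib.request import urlopen


class PageParser(HTMLParser):
    """Collects the title, visible text, and outgoing links of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.text = []
        self.links = []
        self._in_title = False
        self._skip = 0  # depth inside <script>/<style>, whose content is not visible text

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif not self._skip and data.strip():
            self.text.append(data.strip())


def http_fetch(url):
    """Hypothetical default fetcher: page body as text, or None on any failure."""
    try:
        with urlopen(url, timeout=10) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        return None


def crawl(seeds, fetch=http_fetch, max_pages=100):
    """Breadth-first crawl from the seed URLs; returns {url: {"title", "text"}}."""
    queue = deque(seeds)
    seen = set(seeds)
    database = {}  # stand-in for the real storage layer
    while queue and len(database) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # unreachable page: skip it and move on
        parser = PageParser()
        parser.feed(html)
        database[url] = {"title": parser.title.strip(), "text": " ".join(parser.text)}
        for href in parser.links:
            # Resolve relative links and drop #fragments so pages are not revisited
            link = urldefrag(urljoin(url, href)).url
            if link.startswith(("http://", "https://")) and link not in seen:
                seen.add(link)
                queue.append(link)
    return database
```

Calling `crawl(["http://example.com/"], max_pages=20)` would walk outward from that one seed page. This sketch leaves out things I assume a real crawler needs, such as honoring robots.txt, waiting between requests to the same host, and storing results somewhere more durable than a dict.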
I don't have any experience with this personally, which is why I've come to ask if anyone here has any tips for me as far as what the crawler should be looking for, what the best way to set one up would be... and more importantly, if anyone here has any experience as an SEO that could share some bits of wisdom in the nitty gritty of how a search engine is designed and functions. Any articles on the topic (not discussion about the technology, but about how the technology works, in-depth if at all possible) would be appreciated as well.
This is a planning phase, hashing out the extremities of such an endeavor, finding out as much as I can about the technology, so that I can make an educated decision on whether or not this route is one worth taking at this point in time.
~Plystire