Friday, April 4, 2008

Build your own search engine in PHP

Do you know how to build your own search engine in PHP? It is not that hard, because there are already some ready-to-use pieces, like Lucene. You just have to decide exactly how you want your search engine to work and put the pieces together.

Of course, you can't do in 15 minutes what took others years, but I want to give you a few hints on how to create your own search functionality. You could use this to index the web, and if you do, let me know so I can stop using Google.

But of course, this stuff can also be used to create your own vertical or horizontal search engine. The magic word is Lucene.

By the way, I'm following the livestream of The Next Web, where day 2 starts at 10:30. So read on and try writing some cool things, or watch the stream for inspiration (many startups giving short presentations) and come back later to turn your idea into reality.

So what is this about? Recently, I wrote an article about using the Zend Framework in CakePHP. This is pretty easy, and I'm still trying out Zend components for use in CakePHP. I wrote a little example site (running locally) where you can log in using your OpenID and view some del.icio.us bookmarks.

Yesterday, I tried Zend_Search_Lucene, simply because it had piqued my curiosity.

Of course, you first have to integrate the Zend Framework into your CakePHP installation. I assume you already have a controller. In that controller, create a function called search:


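function search($query = "cake") {
    // Load Zend_Search_Lucene through CakePHP's vendor() loader.
    vendor('Zend/Search/Lucene');

    // Calling /controller/search/build (re)creates the index before searching.
    if ($query == "build") {
        $index = Zend_Search_Lucene::create('/tmp/my-index');

        // Index the start page and store its URL next to the parsed HTML.
        $url = "http://cakephp.agoris.nl/";
        $doc = Zend_Search_Lucene_Document_Html::loadHTMLFile($url);
        $doc->addField(Zend_Search_Lucene_Field::Text('url', $url));
        $index->addDocument($doc);

        // Index the first ten links found on the start page.
        // Note: $url.$link assumes every link is relative to the site root.
        $i = 0;
        foreach ($doc->getLinks() as $link) {
            $current_doc = Zend_Search_Lucene_Document_Html::loadHTMLFile($url.$link);
            $current_doc->addField(Zend_Search_Lucene_Field::Text('url', $url.$link));
            echo "{$link}\n";
            $index->addDocument($current_doc);
            if (++$i >= 10) break;
        }
    }

    // Open the index and hand the hits for the query to the view.
    $index = Zend_Search_Lucene::open('/tmp/my-index');
    $hits = $index->find($query);
    $this->set('hits', $hits);
}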


Two things to note. You create the index by calling /controller/search/build: the first URL is opened and added to the index, and the first ten links on that page are also analyzed and added to the index.
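
The controller builds each linked URL as $url.$link, which only works when every link on the page is relative to the site root. Here is a minimal sketch of a more forgiving approach. The helper name resolveLink is my own invention for illustration (it is not part of the Zend Framework or CakePHP), and it deliberately ignores "../" segments and query-only links:

```php
<?php
// Hypothetical helper: turn a link found on a page into an absolute URL.
// Returns null for links that should not be crawled.
function resolveLink($base, $link)
{
    // Skip empty links, in-page anchors and non-HTTP schemes.
    if ($link === '' || $link[0] === '#'
        || preg_match('#^(mailto|javascript|tel):#i', $link)) {
        return null;
    }

    // Already absolute: use it as-is.
    if (preg_match('#^https?://#i', $link)) {
        return $link;
    }

    $parts = parse_url($base);
    $root  = $parts['scheme'] . '://' . $parts['host']
           . (isset($parts['port']) ? ':' . $parts['port'] : '');

    // Protocol-relative link such as //example.com/page.
    if (substr($link, 0, 2) === '//') {
        return $parts['scheme'] . ':' . $link;
    }

    // Root-relative link such as /2008/03/20/.
    if ($link[0] === '/') {
        return $root . $link;
    }

    // Otherwise treat the link as relative to the directory of the base page.
    $path = isset($parts['path']) ? $parts['path'] : '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);

    return $root . $dir . $link;
}
```

Inside the foreach loop you could then call resolveLink($url, $link) and skip the link when the result is null, instead of concatenating the two strings.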

After you have built your index, search it with a URL like /controller/search/php, where the last segment is the query.
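
The controller only passes $hits to the view, so you still need something to display them. Here is a rough sketch of how that could look. It assumes each hit exposes score, title and url: as far as I know Zend_Search_Lucene hits give you the score plus the stored document fields, and the HTML document class stores the page title, but do verify that against your Zend Framework version. The function name renderHits and the demo data are made up for this example:

```php
<?php
// Hypothetical view helper: turn search hits into an HTML list.
// Each hit is assumed to expose ->score, ->title and ->url.
function renderHits($hits)
{
    if (count($hits) === 0) {
        return '<p>No results.</p>';
    }

    $html = "<ul>\n";
    foreach ($hits as $hit) {
        // Escape stored values before printing them into the page.
        $html .= sprintf(
            "  <li><a href=\"%s\">%s</a> (score %.2f)</li>\n",
            htmlspecialchars($hit->url, ENT_QUOTES),
            htmlspecialchars($hit->title, ENT_QUOTES),
            $hit->score
        );
    }

    return $html . "</ul>\n";
}

// Stand-in hit so the sketch runs without a real index.
$demo = new stdClass();
$demo->score = 0.75;
$demo->title = 'CakePHP & Zend';
$demo->url   = 'http://cakephp.agoris.nl/';

echo renderHits(array($demo));
```

In a real CakePHP view you would call renderHits($hits) with the variable set by the controller instead of the stand-in object.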

So, this is a very short example of how to use CakePHP and the Zend Framework to create your own searchable index. You could use it to create your own search engine for the web, your website, or any other application with search functionality.

Want me to help implement this in your application? Just contact me!

35 comments:

  1. Nice, when are you going to update your site? :p

  2. First off, I'm a young developer, not especially knowledgeable with PHP.
    You've sparked my curiosity! I have a working cakePHP project which I have been struggling with adding a search feature to. I'd be excited to find a very simple tutorial on how to implement Zend components in cakePHP file structure and to create a simple search on text fields..

    I've tried using this example on my own project and it always returns an empty $results..
    http://bakery.cakephp.org/articles/view/search-feature-to-cakephp-blog-example


    So I'm searching for a good tutorial.. Do you know any good resources? I need to search first_name and last_name fields and allow for misspellings.

  3. Hi kyamry,

    Have you read http://cakephp.agoris.nl/2008/03/20/howto-use-zend-framework-in-cakephp/ ? It might be helpful.

    I would like to have a look at your project and the problem. Is that possible?

    Regards,


    Steven

  4. COMMENT BY PATRICK MC

    Good article. There are lots of products out there that can help build one's own search engine. In addition to the ones you mentioned, I see a few commenters have tried to promote their own.

    In general though, instead of being locked into a proprietary "product", I like to stick with standardized, proven, scripting technologies such as perl, biterScripting, UNIX-Shell. The advantages of using these generalized scripting abilities are as follows.

    1. The scripting languages can run in both real-time and batch modes. They can even run as part of your web search portal and get real time data.
    2. Hiring, training, managing your staff will be easier, since they will be learning and using a generalized scripting language, and not some very specialized "package".
    3. Since you will be developing the scripts, you have proprietary rights over them, and you are building up sellable assets with these scripts over time.
    4. Since you develop the scripts yourself, you have full control over them, and can modify them easily as requirements change with time.
    5. Scripting languages cost way less than corresponding "products".

    Regards.

    Patrick Mc

  5. Hello there~ I need your help in implementing this~ Pls reply me at pearly_yeo90@hotmail.com .. thanks~

  6. PERL? What year is this? Why on this green earth would you code something in perl? This is a cakePHP blog...
