
How to update a htdig search engine database

 


Question:

I use ht://Dig from www.htdig.org to provide a search function on our web site. Every now and then, new pages appear on the site or existing ones get updated. How do I update the search engine's database?

Answer:

You need to run the rundig script after each significant change. You can also add the command to your crontab and schedule it for daily execution.

If you simply run rundig, it will visit all pages it can reach from the start page and rebuild the database completely. These two steps are called 'crawling' and 'indexing'.

The downside is that the database is not available while crawling and indexing are in progress, so visitors to your web site cannot use the search function during that time.

The solution is the '-a' parameter of the rundig script. It makes rundig use alternate work files during crawling and indexing. The alternate work files carry an additional .work extension, so the file list in your /htdig/db folder will temporarily look like this:

db.docdb
db.docdb.work
db.docs
db.docs.index
db.metaphone.db
db.soundex.db
db.wordlist
db.wordlist.work
db.words.db



Basically, a second copy of the database is built, which leaves the original files available to htsearch. Once htdig and htmerge have finished building the .work database files, rundig moves them into place, replacing the original files.
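To make the "move into place" step concrete, here is a minimal shell sketch of the idea. It is not the actual rundig source. The function name promote_work_files and its directory argument are made up for illustration; rundig does this step for you when called with -a.

```shell
# Illustration only: rename every *.work file in a database folder
# to the same name without the .work extension.
# promote_work_files and its argument are hypothetical names.
promote_work_files() {
    dbdir=$1
    for f in "$dbdir"/*.work; do
        [ -e "$f" ] || continue      # no .work files present: nothing to do
        mv -f "$f" "${f%.work}"      # e.g. db.docdb.work -> db.docdb
    done
}
```

Calling promote_work_files /htdig/db would replace db.docdb and db.wordlist with their freshly built .work counterparts and leave files without a .work twin untouched.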

Read "How do I set up a cron job?" to see how to schedule rundig -a for daily execution.
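As a rough sketch of what such a daily schedule could look like, the snippet below prints a crontab entry that runs rundig -a at 02:30 every night. The time of day, the helper name cron_line, and the path /usr/local/bin/rundig are assumptions; use the location of rundig on your own server.

```shell
# Build a crontab entry that runs "rundig -a" once a day at 02:30.
# cron_line is a hypothetical helper; the rundig path is an assumption.
cron_line() {
    rundig_path=$1
    # crontab fields: minute hour day-of-month month day-of-week command
    printf '%s\n' "30 2 * * * $rundig_path -a"
}

cron_line /usr/local/bin/rundig
```

The printed line can be pasted into the table that opens when you run crontab -e.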



Comments:

2015-02-01, 01:37:54
anonymous from India  
That's a genuinely impressive answer.

 

 
