mamcx Guest
Posted: Wed Jul 27, 2005 9:09 pm Post subject: Help in my project technology decisions...
I'm building a search engine (index server + crawler(s) + search
interface) for small to mid-size companies (2-200 computers?). It is
aimed at the Latin American market, but I can still get valuable input
here. Suppose I have all the building blocks, or can buy anything
necessary. Suppose I have a solid marketing/business plan (ha!) and all
that. The customer is a company with a network and computers. Leaving
aside astronautic features (like claiming to have AI and that kind of
thing), I want to know what is a *must* for version 1... (I sketched
some things, built a wish list, etc... Impossible to do in less than
3 years!)... so I cut the list down... and this is my basic overview:
(Please add + if you think an item is OK, - if it can wait)
* Security
- Integrated Windows security
- The communication between the index server and the web page is
protected with SSL by default (the end user does not need to buy a
certificate)
- Users can only see the files they are permitted to access
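
A rough sketch of the last Security item, trimming a result list down to
what the searching user may read. It assumes the crawler stored, for each
indexed file, the set of user/group names allowed to read it; the
IndexedFile type, the allowed field and visible_results are hypothetical
names for illustration, not part of any existing product or of Windows
security itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexedFile:
    path: str
    # Assumption: user/group names with read access, captured at crawl time.
    allowed: frozenset


def visible_results(results, user, user_groups):
    """Keep only the hits whose stored ACL names the user or one of their groups."""
    principals = {user} | set(user_groups)
    return [r for r in results if r.allowed & principals]


hits = [
    IndexedFile("budget.xls", frozenset({"finance"})),
    IndexedFile("menu.doc", frozenset({"everyone"})),
    IndexedFile("notes.txt", frozenset({"ana"})),
]

# "ana" belongs to "everyone" but not to "finance".
print([r.path for r in visible_results(hits, "ana", ["everyone"])])
# → ['menu.doc', 'notes.txt']
```

Filtering at query time like this keeps the index itself shared; the cost
is that stored permissions can go stale between crawls.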
* Privacy
- The END USER can mark a file/folder/email folder as private, so that
even the administrator cannot see it?
* Ease of use
- The focus is to provide an above-average search interface using Ajax
technology?
- Use different layouts for returned items (e.g. the user searches for
"mountain" and gets a Word document, a JPG image and an MP3 file; there
is a "document", "image" and "music" layout, and each one is applied to
the matching item)
- Tabbed meta-categories (e.g. All, Files, Music, etc...)
- Provide clustered search (like http://clusty.com/) ??
- A small toolbar under each returned item with: Get file, preview,
cached data, properties, convert to pdf/html???
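
The per-item layouts and the tabbed categories could both hang off one
lookup from file extension to layout name. This is only a minimal sketch:
the extension table, the fallback to "document" and the function names are
assumptions made up for the example.

```python
import os

# Hypothetical table: which layout each file extension is rendered with.
LAYOUT_BY_EXTENSION = {
    ".doc": "document", ".txt": "document", ".html": "document",
    ".jpg": "image", ".gif": "image", ".png": "image",
    ".mp3": "music",
}


def layout_for(path):
    """Pick a layout from the file extension; unknown types fall back to 'document'."""
    ext = os.path.splitext(path)[1].lower()
    return LAYOUT_BY_EXTENSION.get(ext, "document")


def group_by_tab(paths):
    """Build the tab contents: an 'All' tab plus one tab per layout in use."""
    tabs = {"All": list(paths)}
    for p in paths:
        tabs.setdefault(layout_for(p), []).append(p)
    return tabs


results = ["mountain.doc", "mountain.JPG", "mountain.mp3"]
print([layout_for(p) for p in results])
# → ['document', 'image', 'music']
```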
* Features:
- Search files (Office, OpenOffice, HTML, text files, jpg, gif, png,
mp3, wma, etc...)
- The MAIN search interface is a classic web page (with a little Ajax)
- The SECONDARY search interface is a virtual Windows folder
- Search email folders (Outlook + Outlook Express)
- Search the intranet website ?
- Group duplicated info/files? (e.g. the file foo.doc was sent by email
to all employees. The results do not list every copy of foo.doc (they
sit on different computers) but show it once and say how many
duplicates exist)
- Use file system notifications to pick up fresh content when the user
changes a file
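
The duplicate grouping in the list above can be sketched by hashing file
contents and keeping one representative per hash. This assumes that
"duplicate" means byte-identical content (a copy that was edited or
re-saved would not match), and the function names are hypothetical.

```python
import hashlib
from collections import defaultdict


def content_digest(data: bytes) -> str:
    """Fingerprint a file by its bytes, so name and location do not matter."""
    return hashlib.sha256(data).hexdigest()


def group_duplicates(files):
    """files: iterable of (location, content bytes).

    Returns one (first location seen, number of copies) pair per distinct content.
    """
    groups = defaultdict(list)
    for location, data in files:
        groups[content_digest(data)].append(location)
    return [(locations[0], len(locations)) for locations in groups.values()]


crawled = [
    ("pc1/foo.doc", b"quarterly report"),
    ("pc2/inbox/foo.doc", b"quarterly report"),
    ("pc3/foo.doc", b"quarterly report"),
    ("pc1/bar.txt", b"something else"),
]
print(group_duplicates(crawled))
# → [('pc1/foo.doc', 3), ('pc1/bar.txt', 1)]
```

In a real index the digest would be computed once at crawl time and stored
next to the document, so grouping at query time is just a lookup.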
* Others
- Support Win2000+. Allow crawling Win95-98/Linux computers from a
Win2000+ computer through a shared directory
- Web interface
This is something I'm undecided about... what to build the website in
(asp, asp.net, php???). I want the easiest deployment possible, so
even though I'm very comfortable with asp.net, I want a single
file-copy setup and, if possible, to avoid any OS upgrade along the
way... and if I go the Ajax way, I think most of the work is on the
client side rather than on the web server..
- Deploy the website to IIS by default
- Have an Apache option??? (i.e. as an embedded web server for small
shops that don't know how to set up IIS properly????)
- The index server is a self-contained web service application (I'm
inclined to build this with RemObjects and deploy it as a Windows service)
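
For the machines that are crawled through a shared directory, one possible
fallback is to re-walk the share periodically and compare snapshots. This
polling approach is an assumption swapped in for the example, not something
the list above specifies, and snapshot/diff are hypothetical names.

```python
import os


def snapshot(root):
    """Map every file path under root to its (size, modification time)."""
    seen = {}
    for folder, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            seen[path] = (info.st_size, info.st_mtime)
    return seen


def diff(old, new):
    """Compare two snapshots and return (added, changed, removed) path lists."""
    added = sorted(p for p in new if p not in old)
    removed = sorted(p for p in old if p not in new)
    changed = sorted(p for p in new if p in old and new[p] != old[p])
    return added, changed, removed
```

A crawler would keep the previous snapshot, call snapshot() on the share
root again on a timer, and re-index only what diff() reports as added or
changed.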
Is there anything else that is a *must* for a decent version 1, taking
into account that free desktop versions exist and some search engines
like Copernic are already on the market?
--
Mutis: The open source indexing/search engine for Delphi
http://mutis.sourceforge.net/
(Alpha stage: Developers Wanted!)