Information Retrieval Blog » information Retrieval http://blog.zye.me ANTI-GFW Sun, 29 Aug 2010 03:59:54 +0000 http://wordpress.org/?v=2.9.1 en hourly 1 What’s Google doing in search? c10088bc http://blog.zye.me/2010/01/55471.html http://blog.zye.me/2010/01/55471.html#comments Mon, 25 Jan 2010 18:09:20 +0000 Jeffye http://blog.so8848.com/2010/01/55471.html 1. Interesting highlighting in search results or snippets.

2. Synonym expansion (a form of query expansion)

3. Social Search in Google labs

4. Google Squared

Extracts interesting facts from web pages and presents them to you in a meaningful way.

5. real-time search

http://blog.zye.me/2010/01/55471.html/feed 0
Content-based image retrieval (CBIR) toolkits and packages http://blog.zye.me/2009/05/52060.html http://blog.zye.me/2009/05/52060.html#comments Sat, 23 May 2009 01:13:07 +0000 jeffye http://blog.so8848.com/?p=52060 This post presents several tools for CBIR. Unfortunately, all of these tools lack documentation. I chose LIRE since I am familiar with Lucene and Java.

Do you know of any other good choices? If so, please comment. Thanks.

The LIRE (Lucene Image REtrieval) library (Java based)

It is a CBIR system based on Lucene (Java-based) and Caliph and [...]]]> http://blog.zye.me/2009/05/52060.html/feed 0 Global Ranking http://blog.zye.me/2009/05/51966.html http://blog.zye.me/2009/05/51966.html#comments Sun, 17 May 2009 23:20:58 +0000 jeffye http://blog.so8848.com/2009/05/51966.html Backup Links

    Sent to you by Jeffye via Google Reader:     Global Ranking via Research on Search by Dell Zhang on 5/17/09

Global ranking looks like a promising direction in learning-to-rank research for information retrieval.

[1] Global Ranking Using Continuous Conditional Random Fields
[2] Global Ranking by Exploiting User Clicks

http://blog.zye.me/2009/05/51966.html/feed 0
Major Journals and Conferences in Information Retrieval http://blog.zye.me/2009/04/51385.html http://blog.zye.me/2009/04/51385.html#comments Mon, 27 Apr 2009 16:14:07 +0000 jeffye http://blog.so8848.com/?p=51385 Journals

TOIS – ACM Transactions on Information Systems. Publication: 467  Citation: 13992
IPM – Information Processing and Management. Publication: 2142  Citation: 9622
JASIS – Journal of the American Society for Information Science and Technology. Publication: 2559  Citation: 10127
SIGIR Forum. Publication: 900  Citation: 3378
IR – Information Retrieval. Publication: 238  Citation: 1400

Conferences

SIGIR – Research and Development in Information Retrieval Publication: 2304  Citation: 28040 TREC – Text REtrieval [...]]]> http://blog.zye.me/2009/04/51385.html/feed 0 Live Lab released Image Preference Dataset http://blog.zye.me/2009/03/50497.html http://blog.zye.me/2009/03/50497.html#comments Sun, 22 Mar 2009 16:20:02 +0000 jeffye http://blog.so8848.com/?p=50497 Live Lab released Image Preference Dataset

To promote research on preferences in the search and machine learning communities, Live Labs is releasing new anonymized data from a project we did last year.  The Picture This game randomly pairs two users in a game setting.  The game selects queries, then the players are shown two or more images.  They [...]]]> http://blog.zye.me/2009/03/50497.html/feed 0 BM Family (e.g., Okapi BM25) Weighting Formulas: Introduction and Important Literature http://blog.zye.me/2009/02/49794.html http://blog.zye.me/2009/02/49794.html#comments Thu, 26 Feb 2009 22:59:14 +0000 jeffye http://blog.so8848.com/2009/02/49794.html Introduction to the BM family of weighting schemes and important literature

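1. Okapi BM25 is a very important ranking formula in the IR field. "BM" stands for "best match". It is grounded in the probabilistic retrieval theory that Stephen E. Robertson began developing in the 1970s; it is the work that made Professor Robertson's name and established his eminent standing in IR.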
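As a rough illustration of how such a best-match weight behaves, here is a minimal Java sketch of the textbook single-term BM25 score. It is not a reconstruction of the truncated fragment in this post: the class name `Bm25Sketch`, the method `score`, the use of the natural logarithm via `Math.log`, and the sample parameter values `k_1 = 1.2` and `b = 0.75` are all assumptions made for illustration, and the query-term-frequency factor found in some formulations is left out.

```java
// Illustrative sketch only: textbook Okapi BM25 weight for ONE query term in ONE document.
// Class and method names are hypothetical; the log base (natural log) is an assumption.
final class Bm25Sketch {
    private Bm25Sketch() {}

    /**
     * @param tf                    term frequency in the document
     * @param docLength             length of the document (in tokens)
     * @param averageDocumentLength average document length in the collection
     * @param numberOfDocuments     number of documents in the collection
     * @param n_t                   number of documents containing the term
     * @param k_1                   term-frequency saturation parameter
     * @param b                     document-length normalisation parameter (0..1)
     */
    static double score(double tf, double docLength, double averageDocumentLength,
                        double numberOfDocuments, double n_t, double k_1, double b) {
        // Length-normalised denominator; it already includes tf.
        double K = k_1 * ((1 - b) + b * docLength / averageDocumentLength) + tf;
        // Robertson/Sparck Jones style inverse document frequency.
        double idf = Math.log((numberOfDocuments - n_t + 0.5d) / (n_t + 0.5d));
        // Saturating term-frequency component multiplied by the idf.
        return idf * ((k_1 + 1d) * tf / K);
    }

    public static void main(String[] args) {
        // Hypothetical numbers: 1000 documents, term occurs in 10 of them,
        // 3 times in a document of exactly average length.
        double s = score(3, 100, 100, 1000, 10, 1.2, 0.75);
        System.out.println("BM25 score: " + s);
    }
}
```

With these definitions the score is zero when the term is absent, grows with `tf` but saturates, and drops for documents longer than average when `b > 0`.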

//////////////Okapi  bm25 formula/////////////////////

double K = k_1 * ((1 - b) + b * docLength / averageDocumentLength) + tf;

return Idf.log((numberOfDocuments - n_t + 0.5d) / (n_t + 0.5d)) * [...]]]> http://blog.zye.me/2009/02/49794.html/feed 0 A Look at the Leading Figures in IR (Information Retrieval) http://blog.zye.me/2008/11/44203.html http://blog.zye.me/2008/11/44203.html#comments Thu, 13 Nov 2008 03:51:45 +0000 jeffye http://www.5yiso.cn/2008/11/44203.html http://blog.zye.me/2008/11/44203.html/feed 0 A Guide to Learning Resources for Machine Learning and Artificial Intelligence http://blog.zye.me/2008/11/43780.html http://blog.zye.me/2008/11/43780.html#comments Sat, 01 Nov 2008 15:35:26 +0000 jeffye http://www.5yiso.cn/2008/11/43780.html A Guide to Learning Resources for Machine Learning and Artificial Intelligence. This article is from: http://blog.csdn.net/pongba/archive/2008/09/11/2915005.aspx

Many of the books recommended in it are true classics. I have read two of them: 1. Stanford's "Introduction to Information Retrieval" and 2. Bishop's "Pattern Recognition and Machine Learning". Both are well worth reading, and electronic versions of both can be found online; if you really cannot find them, leave me a comment. I gave a download link for "Introduction to Information Retrieval" in an earlier post; the e-book of the other one is too large and I have nowhere to upload it.

———————————————————————————————–

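I often recommend books on the TopLanguage discussion group, and I often ask the experts there to dig up related material. Artificial intelligence, machine learning, natural language processing, knowledge discovery (data mining in particular), and information retrieval are without doubt the most fun branches of CS (and they are closely interrelated). Here I gather some recent learning resources on machine learning and artificial intelligence in one place:

First, two excellent Wikipedia entries. I count as a heavy Wikipedia user myself; when learning a subject I often find that I start at Wikipedia, pass through several Google searches along the way, and end at one or a few books.

The first is "History of Artificial Intelligence", about which I wrote on the discussion group:

The article I read today is the best I have come across on Wikipedia so far. Titled "History of Artificial Intelligence", it follows the timeline of AI's development, weaving in countless stories of remarkable people, full of twists and turns and grand in sweep; truly a case of "fact is more surprising than imagination". Artificial intelligence began with philosophical speculation and then went through a stage without help from psychology (cognitive neuroscience in particular), in which pioneers explored it only through induction over and introspection on the outward manifestations of human thinking, together with mathematical tools. The most exciting part is an automated theorem prover written by Herbert Simon (father of decision theory, Nobel laureate, a giant across several fields), which proved twenty-odd theorems from Russell's Principia Mathematica, one of them with a proof more elegant than the one in the original book. Simon's program used heuristic search, because a proof in an axiomatic system can be reduced to a tree search from premises to conclusion (but because of combinatorial explosion, heuristic pruning is necessary). Later Simon also wrote GPS (General Problem Solver), which is said to solve some problems that can be well formalized, such as the Tower of Hanoi. But in the end Simon's research touched only a very, very small aspect of human thinking: Formal [...]]]> http://blog.zye.me/2008/11/43780.html/feed 0 A Guide to Information Retrieval http://blog.zye.me/2008/08/39116.html http://blog.zye.me/2008/08/39116.html#comments Wed, 06 Aug 2008 07:49:06 +0000 jeffye http://www.5yiso.cn/2008/08/39116.html Information retrieval resources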

———————

Books

+ Finding Out About: Search Engine Technology from a Cognitive Perspective (Belew, R.K., 2000)

http://www-cse.ucsd.edu/~rik/foa/

+ Foundations of Statistical Natural Language Processing (C. Manning and H. Schütze, 1999)

+ Information Retrieval, 2nd edition (C.J. van Rijsbergen, 1979)

(full text)

http://www.dcs.gla.ac.uk/Keith/Preface.html

+ Information Retrieval: A Survey (Ed Greengrass, 2000)

http://www.csee.umbc.edu/cadip/readings/IR.report.120600.book.pdf

+ Information Retrieval: Data Structures & Algorithms (Frakes, W. and Baeza-Yates, R., 1992)

http://www.dcc.uchile.cl/~rbaeza/iradsbook/irbook.html

+ Information Retrieval [...]]]> http://blog.zye.me/2008/08/39116.html/feed 0 Papers Written by Googlers http://blog.zye.me/2008/04/27952.html http://blog.zye.me/2008/04/27952.html#comments Tue, 15 Apr 2008 08:40:54 +0000 jeffye http://www.5yiso.cn/2008/04/27952.html Google has published many of its research papers, and many of them look very good. The links are below.

Below is a partial list of publications by people after joining Google, organized by category. There is also a list organized by year, and an atom feed is also available.

Categories
Algorithms and Theory (151)
Artificial Intelligence and Data Mining (72)
Audio, Video, and Image Processing (79)
Distributed Systems and Parallel Computing (117)
Education (5)
General Science (22)
Human-Computer Interaction and [...]]]>
http://blog.zye.me/2008/04/27952.html/feed 0