IS 202 : Information Organization and Retrieval

Administrivia

Teaching Team 

202 Team

Professor Bob Glushko

Email: glushko@sims.berkeley.edu

Website: http://www.sims.berkeley.edu/~glushko/

Office phone: (510) 643-2754

TA Kelly Snow

TA Ben Hill

Course Information

School of Information Management and Systems INFOSYS 202

Course Dates: August 30 to December 8, 2005

Lecture Schedule: Tuesday and Thursday, 10:30am-12:00pm in 202 South Hall

Units: 4

Grading Option: Letter Grade only

Course Texts

Required

The Organization of Information (Library and Information Science Text Series), Arlene Taylor. 2004. ISBN: 1563089696

Available at the campus bookstore

Finding Out About, Richard Belew. 2001. ISBN: 020139829X

Reader

Available at Copy Central, Bancroft

Course Work

Undated :

[Dec 13] FINAL EXAM 

Lecturer: Bob Glushko

Resources

Lecture Notes

August 30 : Tuesday

[Aug 30] Course Overview 

Lecturer: Bob Glushko

Required Readings

"The Library of Babel, from Labyrinths: Selected Stories & Other Writings" Jorge Luis Borges [Reader]

"Darwin's Dangerous Idea (Excerpt from Chapter 2, The Library of Mendel. pp. 107-111)" Daniel C. Dennett [Reader]

"Conduit Metaphor: A Case of Frame Conflict in our Language About Language (In Andrew Ortony (ed.), Metaphor and Thought) (skip "semantic pathology" from 176-184)" Michael Reddy [Reader]

"As We May Think" Vannevar Bush [Reader]

"Passive Capture and Ensuing Issues for a Personal Lifetime Store. (Continuous Archival and Retrieval of Personal Experiences CARPE '04)" Jim Gemmell, Lyndsay Williams, Ken Wood, Roger Lueder, and Gordon Bell [Reader]

"The Archivist (about Brewster Kahle and internet archive)" Paul Boutin [Reader]

Slate (April 2005)

Resources

Assignment 1: Short writing assignment about themes in first day readings assigned 

Due on September 1

Assignment details

September 1 : Thursday

[Sept 1] How to Think About Information 

Lecturer: Bob Glushko

Required Readings

Chapter 1, Chapter 3 (49-61) of The Organization of Information (Library and Information Science Text Series) [Textbook]

"How Much Information Executive Summary" Peter Lyman, Hal Varian, et al. [Reader]

"Data Modeling for Everyone (Chapter 1)" Sharon Allen [Reader]

Resources

Assignment 1: Short writing assignment about themes in first day readings due 

September 6 : Tuesday

[Sept 6] Information Organization {and,or,vs} Search 

Lecturer: Bob Glushko

Required Readings

Chapter 1, Chapter 4 (105-109, 116-124) of Finding Out About [Textbook]

"User Interfaces and Visualization (in Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Modern Information Retrieval [Section 10.3, The Information Access Process])" Marti Hearst [Reader]

"The design of browsing and berrypicking techniques for the online search interface [read 407-414 carefully, skim or skip the rest]" Marcia Bates [Reader]

Online Review, 13(5) (1989)

"Surf Like a Bushman" Rachel Chalmers [Reader]

New Scientist (11 November 2000)

"The Deep Web: Surfacing Hidden Value (first few pages, up to Figure 2) " Michael Bergman [Reader]

Journal of Electronic Publishing (July 2001)

Resources

September 8 : Thursday

[Sept 8] Concepts and Categories 

Lecturer: Bob Glushko

Required Readings

"The Vocabulary Problem in Human-System Communication" George W. Furnas, Thomas K. Landauer, Louis M. Gomez, and Susan T. Dumais [Reader]

Communications of the ACM, 30(11), 964-971 (1987)

Chapter 11 (297-300) of The Organization of Information (Library and Information Science Text Series) [Textbook]

"Women, Fire, and Dangerous Things. (preface through p 67)" George Lakoff [Reader]

Resources

Assignment 2: Designing a Vocabulary assigned 

Due on September 15

Assignment details

September 13 : Tuesday

[Sept 13] Metadata and Metadata Standards 

Lecturer: Bob Glushko

Required Readings

Chapter 6, Chapter 7 (159-176), Chapter 8 (201-206, 220-224) of The Organization of Information (Library and Information Science Text Series) [Textbook]

"A bibliographic metadata infrastructure for the 21st century" Roy Tennant [Reader]

Resources

September 15 : Thursday

[Sept 15] Controlled Names and Controlled Vocabularies 

Lecturer: Bob Glushko

Required Readings

Chapter 9 (241-247), Chapter 10 (261-282) of The Organization of Information (Library and Information Science Text Series) [Textbook]

"What's in a name?" Brian Farish [Reader]

"What is a controlled vocabulary?" Karl Fast, Fred Liese, and Mike Steckel [Reader]

"Creating a controlled vocabulary" Karl Fast, Fred Liese, and Mike Steckel [Reader]

"Synonym rings and authority files" Karl Fast, Fred Liese, and Mike Steckel [Reader]

"Controlled vocabularies: A glosso-thesaurus" Karl Fast, Fred Liese, and Mike Steckel [Reader]

"W3C XML Schema Datatypes Reference" Rick Jelliffe [Reader]

Resources

Assignment 2: Designing a Vocabulary due 

September 20 : Tuesday

[Sept 20] Classification 

Lecturer: Bob Glushko

Required Readings

Chapter 11 (301-311, 315-320) of The Organization of Information (Library and Information Science Text Series) [Textbook]

"How to make a faceted classification and put it on the web" William Denton [Reader]

"Faucet Facets: A few best practices for designing multifaceted navigation systems" Jeffrey Veen [Reader]

"Ontology is Overrated: Categories, Links, and Tags" Clay Shirky [Reader]

Resources

Assignment 3: Faceted Classification assigned 

Due on September 27

Assignment details

September 22 : Thursday

[Sept 22] Ontologies 

Lecturer: Bob Glushko

Required Readings

Chapter 10 (282-289) of The Organization of Information (Library and Information Science Text Series) [Textbook]

Chapter 6 (213-220) of Finding Out About [Textbook]

"Card Sorting" Jakob Nielsen [Reader]

"Rave Reviews: Acquiring Relevance Assessments from Multiple Users" Belew [Reader]

"Ontology 101" Natalya Noy and Deborah McGuinness [Reader]

"WordNet: An Electronic Lexical Database - Introduction & Ch 1 (through at least page 34)" Christiane Fellbaum and George Miller [Reader]

Resources

September 27 : Tuesday

[Sept 27] Semantic Web, RDF and OWL 

Lecturer: Bob Glushko

Required Readings

"The Semantic Web" Tim Berners-Lee, James Hendler, and Ora Lassila [Reader]

Scientific American (May 2001)

"RDF Primer" Manola and Miller [Online]

"OWL Web Ontology Language: Use Cases and Requirements. [Sections 1 and 2, p 1-7] " [Online]

Resources

Assignment 3: Faceted Classification due 

Assignment 4: More Faceted Classifications assigned 

Due on October 4

Assignment details

September 29 : Thursday

[Sept 29] Metadata for Multimedia and Non-text Information 

Lecturer: Bob Glushko

Required Readings

"The Language of Images: Enhancing Access to Images by Applying Metadata Schemas and Structured Vocabularies. Introduction to Art Image Access (Murtha Baca, Ed)" Patricia Harpring [Reader]

"From Context to Content: Leveraging Context to Infer Media Metadata" Marc Davis, Simon King, Nathan Good, and Risto Sarvas [Reader]

MM'04

"Media streams. In: Readings in Human-Computer Interaction: Toward the Year 2000, ed. Ronald M. Baecker, Jonathan Grudin, William A. S. Buxton, and Saul Greenberg. 854-866. 2nd ed." Marc Davis [Reader]

"Content Management for Electronic Music Distribution" Francois Pachet [Reader]

Communications of the ACM, 46(4) (2003)

"ID3" [Reader]

Resources

October 4 : Tuesday

[Oct 4] Documents and Document Models 

Lecturer: Bob Glushko

Required Readings

"Document Engineering, Chapter 2, XML Foundations" Glushko and McGrath [Reader]

"Model-driven Application Design for a Campus Calendar Network" Bloodworth and Glushko [Reader]

"XML Too Much of a Good Thing" David Becker [Reader]

Resources

Assignment 4: More Faceted Classifications due 

Assignment 5: Document Types For a Day assigned 

Due on October 11

Assignment details

October 6 : Thursday

[Oct 6] Databases and Data Models (1) 

Guest Lecturer: Ray Larson of SIMS

Required Readings

"Introduction (part of SQL for Web Nerds)" Philip Greenspun [Reader]

"Introduction to Relational Databases" Ian Gilfillan [Reader]

Database Journal (24 June 2002)

Resources

October 11 : Tuesday

[Oct 11] Databases and Data Models (2) 

Guest Lecturer: Ray Larson of SIMS

Required Readings

"Database Normalization" Ian Gilfillan [Reader]

Database Journal (22 March 2002)

Resources

Assignment 5: Document Types For a Day due 

October 13 : Thursday

[Oct 13] Enterprise Content and Knowledge Management 

Lecturer: Bob Glushko

Required Readings

"Harnessing the Power of Enterprise Content Management" Manuel Barbero and Marcia Douglas [Reader]

CIO (18 May 2005)

"When You Say "KM" What Do You Mean?" Laurie Orlov [Reader]

CIO

"U.S. National Archives and Records Administration. Agency Recordkeeping Requirements: A Management Guide" [Reader]

"Information Technology Controls" [Reader]

"Reuse, repurpose, repackage" John F. Terris [Reader]

Resources

October 18 : Tuesday

[Oct 18] Enterprise Data Management 

Lecturer: Bob Glushko

Required Readings

"Adopting an Information Integration Strategy" Judith Hurwitz [Reader]

CIO

"Business Intelligence Gets Smart(er)" Alice Dragoon [Reader]

CIO (15 September 2003)

"Doing it with Meaning" John Edwards [Reader]

CIO (15 August 2002)

"Berkeley Campus Plan Implementing the UC Requirements for Protection of Computerized Personal Information" [Reader]

"XML Transformation and Metadata Repositories Enable Information Integration (XML 2004)" Gantz [Reader]

Resources

October 20 : Thursday

[Oct 20] Information Architecture 

Lecturer: Bob Glushko

Required Readings

"Information Architecture Wiki: Information Architecture Defined" [Reader]

"Faceted metadata for image search and browsing" [Reader]

"Globalization, Localization, Internationalization and Translation" [Reader]

"Multilingual Web sites: Best practice, guidelines and architectures. Volume 1, through section 4.14" [Reader]

Resources

October 25 : Tuesday

[Oct 25] Multi-enterprise Information Organization and Management 

Lecturer: Bob Glushko

Required Readings

"Search Engine Technology and Digital Libraries" Norbert Lossau [Reader]

D-Lib Magazine, 10(6) (June 2004)

"The Strategy Machine. Chapter 4, The Information Supply Chain" Larry Downes [Reader]

"The Plug and Play economy" Glushko [Reader]

"Content integration for e-business (552-560 [Skip section 4 on "Cohera's solution"])" Michael Stonebraker and Joseph Hellerstein [Reader]

ACM SIGMOD (2001)

"Using the UNSPSC: United Nations Standard Products and Services Code" [Reader]

"Metadata Rules - a Report from the Open Forum on Metadata Registries " Alan Kotok [Reader]

Resources

Assignment 6: Short writing assignment assigned 

Due on November 1

Assignment details

October 27 : Thursday

[Oct 27] Social Information Organization and Management 

Lecturer: Bob Glushko

Required Readings

"Collaborative Knowledge Gardening" Jon Udell [Reader]

"Folksonomies - Cooperative Classification and Communication Through Shared Metadata " Adam Mathes [Reader]

"About MusicBrainz" [Reader]

"Collaborative Filtering" [Reader]

"Who Knows Whom, and Who Knows What? " Susannah Patton [Reader]

CIO (15 June 2005)

Resources

November 1 : Tuesday

[Nov 1] Personal Information Organization and Management; FIRST HALF COURSE REVIEW 

Lecturer: Bob Glushko

Required Readings

"Six Roles of Documents in Professionals' Work" Morten Hertzum [Reader]

"A Few Thoughts on Cognitive Overload. (pages 19-33 [Part 1 of Article])" David Kirsh [Reader]

Intellectica

"Keeping and Re-Finding Information on the Web: What do People Do and What Do They Need?" Harry Bruce, William Jones, and Susan Dumais [Reader]

Resources

Assignment 6: Short writing assignment due 

November 3 : Thursday

[Nov 3] Text Processing; Boolean Models 

Lecturer: Bob Glushko

Required Readings

Chapter 2, Chapter 3 (60-65, 71-85) of Finding Out About [Textbook]

Resources

Assignment 7: Lexis Search (ungraded) assigned 

Due on December 6

Assignment details

November 8 : Tuesday

[Nov 8] Vector Models 

Lecturer: Bob Glushko

Required Readings

Chapter 3 (86-102), Chapter 5 (153-160) of Finding Out About [Textbook]

"Data-driven approaches to information access" Susan Dumais [Reader]

Cognitive Science, 27(3), 491-524 (2003)

Resources

November 10 : Thursday

[Nov 10] Structure-based Models 

Lecturer: Bob Glushko

Required Readings

Chapter 6 (182-210, 225-233) of Finding Out About [Textbook]

"Web Crawler (from Wikipedia)" [Reader]

"The Anatomy of a Large-Scale Hypertextual Search Engine" Sergey Brin and Lawrence Page [Reader]

Resources

Assignment 8: Understanding the calculations behind retrieval models assigned 

Due on November 17

Assignment details

November 15 : Tuesday

[Nov 15] No Class  

[Nov 14] Search Engines: Multimedia Search and Retrieval 

Guest Lecturer: Brad Horowitz

Required Readings

"The Image User and the Search for Images. Introduction to Art Image Access (Murtha Baca, Ed)" Christine Sundt [Reader]

"Google, Meet Tivo" [Reader]

Economist (9 June 2005)

"Bridging the Semantic Gap in Content Management Systems: Computational Media Aesthetics" Chitra Dorai and Svetha Venkatesh [Reader]

Proceedings of COSIGN 2001: Computational Semiotics

"Audio Retrieval by Dynamic Similarity" Jonathan Foote, Matthew Cooper, and Unjung Nam [Reader]

Proceedings of the International Conference on Music Information Retrieval, 2002.

Resources

November 17 : Thursday

[Nov 17] Probabilistic Models 

Guest Lecturer: Ray Larson of SIMS

Required Readings

Chapter 5 (167-179), Chapter 7 (252-266) of Finding Out About [Textbook]

"A Plan for Spam (about Bayesian categorization)" Paul Graham [Reader]

Resources

Assignment 8: Understanding the calculations behind retrieval models due 

November 22 : Tuesday

[Nov 22] User Interfaces for IR 

Guest Lecturer: Marti Hearst of SIMS

Required Readings

"User Interfaces and Visualization (in Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Modern Information Retrieval)" Marti Hearst [Reader]

Resources

November 24 : Thursday : Thanksgiving

Holiday: [Nov 24] Thanksgiving 

November 29 : Tuesday

[Nov 29] Applied IR and Natural Language Processing [1] 

Lecturer: Bob Glushko

Required Readings

"Smart Tools" [Reader]

Business Week (Spring 2003)

"Introduction to the Special Issue on Summarization" Dragomir Radev, Eduard Hovy, and Kathleen McKeown [Reader]

Computational Linguistics, 28(4) (2002)

"Introduction to the Special Issue on the Web as Corpus" Adam Kilgarriff and Gregory Grefenstette [Reader]

Computational Linguistics, 29(3) (2003)

Resources

Assignment 9: Applying IO, IR, and NLP in the Real World assigned 

Due on December 8

Assignment details

December 1 : Thursday

[Dec 1] Applied IR and Natural Language Processing [2] 

Lecturer: Bob Glushko

Required Readings

"Siebel Brightware eService" [Reader]

"Web question answering: Is more always better? (SIGIR '02)" Susan Dumais, Michele Banko, Eric Brill, Jimmy Lin, Andrew Ng [Reader]

"Which Semantic Web? (ACM Hypertext 2003)" Catherine Marshall and Frank Shipman [Reader]

Resources

December 6 : Tuesday

[Dec 6] The Business and Professions of IO and IR 

Guest Lecturer: TBD - SIMS students in the classes of 2004 and 2005

Resources

Assignment 7: Lexis Search (ungraded) due 

December 8 : Thursday

[Dec 8] Course Review 

Lecturer: Bob Glushko

Resources

Assignment 9: Applying IO, IR, and NLP in the Real World due 

December 13 : Tuesday

Final Exam 

9:00am-12:00pm, 202 South Hall

last updated on 2005-08-16 by cjjones