IN350 Document Management and Information Steering

Week 1: What is a document?

Class 1: Introduction: the changing definition of a document, document management (DM) in relation to knowledge management, the motivation for DM, and its theories and applications.

Class 2: Document Properties and Markup Languages: properties of a document (content, syntax, structure, presentation style, metadata), what markup is (procedural vs. descriptive), and why move to XML.

Week 2: Michael Spring's lectures were the most important. The Document Processing Revolution relates to his article and notes: the history, what the document process matrix is, how the transition is going, and how XML is involved. His notes, A first look at XML, also give a good overview of the parts of XML.

Week 3: Text properties, processing, file organization and indexes. Zipf’s Law, Heaps’ Law, Appendix A.
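
As a revision aid for the two laws named above: Zipf's Law says that the frequency of the word at rank r is roughly proportional to 1/r, and Heaps' Law says that the vocabulary grows sublinearly with the length of the text. The sketch below is not from the lecture notes. It only computes the two quantities the laws describe (a rank-frequency table and a vocabulary-growth curve) for a made-up sample sentence. The function names and the sample text are assumptions chosen for illustration, and a text this short is far too small to show either law convincingly.

```python
from collections import Counter


def rank_frequency(tokens):
    """Return (rank, word, frequency) tuples, most frequent word first.

    Ties in frequency are broken alphabetically so the result is deterministic.
    """
    counts = Counter(tokens)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, word, freq) for rank, (word, freq) in enumerate(ordered, start=1)]


def vocabulary_growth(tokens):
    """Return the number of distinct words seen after each token (Heaps' curve)."""
    seen = set()
    growth = []
    for tok in tokens:
        seen.add(tok)
        growth.append(len(seen))
    return growth


# Hypothetical sample text, used only to exercise the functions.
sample_text = "the cat sat on the mat and the dog sat on the log"
tokens = sample_text.split()

table = rank_frequency(tokens)
growth = vocabulary_growth(tokens)

for rank, word, freq in table[:3]:
    # Under Zipf's Law, rank * freq stays roughly constant in a large corpus.
    print(rank, word, freq, rank * freq)

print("tokens:", len(tokens), "distinct words:", growth[-1])
```

On a real collection, plotting frequency against rank on log-log axes gives a roughly straight line for Zipf's Law, and plotting the growth list against text length shows the flattening curve that Heaps' Law predicts.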

Week 4: eMarket Places and Information Handling. Network Supply Chains, order fulfilment, industrial networks. This lecture describes the business environment and forces. It does not specifically talk about documents.

Week 5: Text Operations and Search Enhancements: preprocessing of text and creating indexes. Huffman coding is an example of entropy encoding, and it depends on symbol statistics. Lempel-Ziv is based on repeated patterns in the content; it is an example of universal encoding.
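
To make the "depends on statistics" point concrete, here is a minimal sketch of Huffman coding over the characters of a string. It is an illustration written for these notes, not the version presented in the lecture: the function names and the sample message are assumptions, and the codes are built from character counts in the message itself rather than from a language-wide frequency table.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Build a prefix code from the character frequencies found in text."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tie_breaker, {symbol: code_so_far}).
    # The unique tie_breaker ensures the dicts are never compared.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; prefix their codes with 0 and 1.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]


def encode(text, codes):
    """Concatenate the code of every character in text."""
    return "".join(codes[ch] for ch in text)


def decode(bits, codes):
    """Decode a bit string; works because no code is a prefix of another."""
    inverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)


# Hypothetical sample message.
message = "abracadabra"
codes = huffman_codes(message)
bits = encode(message, codes)
print(codes)
print(len(bits), "bits, versus", 8 * len(message), "bits at 8 bits per character")
```

The most frequent character receives the shortest code, which is the sense in which the method depends on statistics. Lempel-Ziv, by contrast, needs no frequency table in advance; it builds a dictionary of repeated substrings as it reads the input.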

Week 6: Retrieval Evaluation Measures: retrieval system quality, effective measures of quality, precision, recall, etc.
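
The two central measures can be written down directly: precision is the fraction of retrieved documents that are relevant, and recall is the fraction of relevant documents that were retrieved. The sketch below assumes a single query with a known set of relevant documents. The document identifiers are made up, and the F1 score (the harmonic mean of the two) is included only as one common example of a combined measure, not necessarily one covered in the lecture.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: document ids returned by the system
    relevant:  document ids judged relevant for the query
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = retrieved_set & relevant_set
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical query result and relevance judgments.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d2", "d4", "d5", "d6", "d7"]

precision, recall = precision_recall(retrieved, relevant)
f1 = f1_score(precision, recall)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Here two of the four retrieved documents are relevant (precision 0.5), and two of the five relevant documents were found (recall 0.4). Returning more documents tends to raise recall and lower precision, which is why the two are reported together.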

Week 7: no lectures.

Week 8: Creating an Index, Types of Indexes. Based on Informix examples. Indexing parameters.

Week 9: Searching the Web: how to measure the size of the web, search engine features, how do they create indexes, how is ranking done, what are spiders, how can users improve searching?

Week 10: Two parts

Part 1: Transport of Multimedia Documents: Ketil Danielsen’s summary. What are the limitations and alternatives?

Part 2: Multimedia Management and Storage Medium: why is compression needed, types, lossy vs. lossless, important multimedia standards, storage alternatives.

Week 11: Data Warehousing: straightforward examples from the article collection on concepts and use, star schema and snowflake schema. Relate this to Informix’s use of DWH technology and to the size of the index.

Week 12: Document Publishing and Distribution: changes in the publishing industry (markets, demands, technology, procedures). What are the changes, and can you talk about each?

Week 13: B2B e-commerce standards for document exchange: refer to the old notes too. What is EDI, why was the old way difficult for SMEs, and why is the new way also difficult? What are the examples of the standards and frameworks? What are the main features of the frameworks? What are the obstacles for SMEs that want to use XML-based EDI solutions?