290-03 : Web Architecture

Administrivia

Teaching Team 

Professor Erik Wilde

Email: dret@berkeley.edu

Website: http://dret.net/netdret/

Office number: +1-510-6432253

TA Igor Pesenson

Course Description

This course is a broad survey of Web-based publishing, defined here as any well-designed service for providing information using Web formats and protocols. It touches on strategy and project planning considerations, but emphasizes design, implementation, and delivery issues. Design topics include publishing process modeling and document workflows, content reuse, document formats, compound documents, internationalization and localization, and the associated questions of usability and accessibility. Implementation issues include URI design, Web server setup, and storage management, starting from the foundation (XML databases) and moving on to specialized content management systems. Delivery issues include cross-media publishing and syndication alternatives such as RSS and Atom.

Course Information

SIMS INFO 290-03

Course Dates: August 28 to December 6, 2007

Lecture Schedule: Tuesday and Thursday, 9:00am-10:30am, in 202 South Hall

Units: 3

Grading Option: Letter Grade only

Course Work

August 28 : Tuesday

Overview and Introduction 

This introductory lecture gives the motivation for the course, some information about the people involved and the organization of the course, a high-level overview of the course's topics, and an overview of the assignments, which are an important part of the course program.

Required Readings

Resources

August 30 : Thursday

Architecture of the World Wide Web 

The Web's architecture rests on a few simple principles: a consistent, global identification mechanism for resources, a standardized way of retrieving resource representations, and standardized media types that make those representations usable. This lecture presents an overview of these architectural principles and illustrates them using blogs as an example of Web-based applications.

Required Readings

Resources

Lecture Notes

Additional Resources

September 4 : Tuesday

Internet Foundations 

The Internet is the technical infrastructure on top of which the Web is built. Some of the services provided by the Internet are essential for the Web, most importantly the naming service and the data transfer service. The Domain Name System (DNS) provides the human-readable names for computers, which can then be used in the addresses of Web servers and ultimately Web pages. The Transmission Control Protocol (TCP) provides the reliable data transfer service between Web servers and Web browsers, building on the very robust Internet Protocol (IP).
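As a rough illustration of the TCP service described above, the following sketch sends a few bytes over a TCP connection on the local machine and reads them back. It is not part of the course material: the function names are hypothetical, the literal loopback address 127.0.0.1 is used so that no DNS lookup is involved, and the echo server handles only one small message.

```python
import socket
import threading


def serve_one_echo(server: socket.socket) -> None:
    """Accept a single connection and send back whatever was received."""
    conn, _addr = server.accept()
    with conn:
        data = conn.recv(1024)  # sketch only: assumes one small message
        conn.sendall(data)


def tcp_echo_round_trip(message: bytes) -> bytes:
    """Send `message` over a loopback TCP connection and return the echo."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.listen(1)
        host, port = server.getsockname()
        worker = threading.Thread(target=serve_one_echo, args=(server,))
        worker.start()
        with socket.create_connection((host, port), timeout=5) as client:
            client.sendall(message)
            client.shutdown(socket.SHUT_WR)  # signal that sending is done
            chunks = []
            while True:
                chunk = client.recv(1024)
                if not chunk:  # empty read: the peer closed the connection
                    break
                chunks.append(chunk)
        worker.join()
    return b"".join(chunks)
```

The application code only sees an ordered byte stream; retransmission and ordering are handled by TCP underneath.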

Required Readings

Required

Timeline

Resources

Lecture Notes

Additional Resources

September 6 : Thursday

Web Foundations (URI & HTTP) 

The Web assumes an underlying network infrastructure providing a reliable, connection-oriented, flow-controlled, end-to-end transport service. Based on such a network service (today provided by the Internet), the Web's transport protocol moves representations of resources identified by a Uniform Resource Identifier (URI) between Web servers and clients. The most important protocol for data transfer on the Web is the Hypertext Transfer Protocol (HTTP).
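To make the relationship between a URI and an HTTP request concrete, here is a minimal sketch that splits a URI into its components and builds the text of a GET request from them. The URI and the helper name `build_get_request` are made up for illustration; a real client would normally use an HTTP library instead of assembling the message by hand.

```python
from urllib.parse import urlsplit


def build_get_request(uri: str) -> str:
    """Return the text of a simple HTTP/1.1 GET request for `uri`."""
    parts = urlsplit(uri)
    # The request target is the path plus the query; the fragment
    # identifier stays on the client and is never sent to the server.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    lines = [
        f"GET {target} HTTP/1.1",
        f"Host: {parts.netloc}",
        "Connection: close",
        "",  # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines)
```

For example, `build_get_request("http://example.org/blog/2007?format=atom#comments")` produces a request line for `/blog/2007?format=atom` and a `Host: example.org` header.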

Required Readings

Required

Cool URIs

Resources

Lecture Notes

Additional Resources

September 11 : Tuesday

Security Issues 

TCP and thus HTTP are clear-text protocols, which make no attempt to hide the data being transmitted. Secure data transfers therefore require additional technologies. This lecture looks briefly into the foundations of cryptographic primitives (such as one-way functions and encryption) and cryptographic protocols. For the Web, the most interesting security feature is secure HTTP interaction, which is provided by HTTP over SSL (HTTPS), a protocol that inserts an encryption layer (SSL or TLS) between TCP and HTTP.
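One-way functions can be tried out with a cryptographic hash. The sketch below uses SHA-256 from Python's standard library as an example of such a function; the choice of SHA-256 and the helper name `digest` are assumptions made for this illustration, not something the lecture prescribes.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as 64 hexadecimal characters."""
    return hashlib.sha256(data).hexdigest()
```

The same input always yields the same digest, a one-character change in the input yields an unrelated digest, and there is no practical way to compute the input back from the digest.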

Resources

Lecture Notes

Additional Resources

September 13 : Thursday

Identity and Authentication 

For any task involving personalization and/or trust, it is necessary to have not only a concept for providing privacy, but also concepts for identity and for proving identity, which requires authentication. HTTP has built-in mechanisms for authentication: the standard HTTP Authentication mechanisms are Basic Authentication and Digest Access Authentication. Instead of these mechanisms, many applications implement their own ways of authentication, which are often based on HTML Forms.
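Basic Authentication is simple enough to sketch in a few lines: the client joins user name and password with a colon, Base64-encodes the result, and sends it in the Authorization header. The helper names below are hypothetical, and the credentials are the well-known example from the HTTP authentication specification.

```python
import base64
from typing import Tuple


def basic_auth_header(user: str, password: str) -> str:
    """Build the value of an Authorization header for Basic Authentication."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def parse_basic_auth(header: str) -> Tuple[str, str]:
    """Recover (user, password) from a Basic Authorization header value."""
    scheme, _, token = header.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic Authentication header")
    user, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return user, password
```

Base64 is an encoding, not encryption: anyone who sees the header can decode it, which is why Basic Authentication is only reasonable over an encrypted connection.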

Resources

September 18 : Tuesday

State Management 

HTTP is a stateless protocol, where each request/response interaction is a separate interaction and there is no protocol support for longer sessions (such as a user logging in and working on a Web site as an identified user). State management refers to mechanisms which provide support for this kind of scenario. The most popular choice for state management is cookies; another possibility is URI-based state management. This lecture is a first glimpse into the world of Representational State Transfer (REST), the Web's fundamental model of handling interaction with resources.
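Cookie-based state management can be sketched with Python's standard `http.cookies` module: the server issues a Set-Cookie header value, and on later requests reads the session identifier back from the Cookie header. The cookie name `session` and both helper names are assumptions made for this example.

```python
from http.cookies import SimpleCookie
from typing import Optional


def issue_session_cookie(session_id: str) -> str:
    """Return a Set-Cookie header value carrying the session identifier."""
    cookie = SimpleCookie()
    cookie["session"] = session_id
    cookie["session"]["path"] = "/"  # send the cookie for the whole site
    return cookie["session"].OutputString()


def read_session_cookie(cookie_header: str) -> Optional[str]:
    """Extract the session identifier from a Cookie request header value."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    if "session" in cookie:
        return cookie["session"].value
    return None
```

The protocol itself stays stateless; the client simply repeats the identifier with every request, and the server uses it to look up the session.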

Required Readings

Required

Wikipedia

Resources

Lecture Notes

Additional Resources

September 20 : Thursday

Representational State Transfer (REST) 

Representational State Transfer (REST) is an architectural style for building distributed systems. The Web is an example of such a system. REST-style applications can be built using a wide variety of technologies. REST's main principles are those of resource-oriented states and functionalities, the idea of a unique way of identifying resources, and the idea of how operations on these resources are defined in terms of a single protocol for interacting with resources. REST-oriented system design leads to systems which are open, scalable, extensible, and easy to understand.
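The idea of a single protocol for all resources can be sketched as an in-memory store in which every resource is identified by a URI and manipulated through the same small set of methods. This is a simplified illustration, not a Web server: the class name `ResourceStore` is hypothetical, and only the status codes are borrowed from HTTP.

```python
from typing import Dict, Optional, Tuple


class ResourceStore:
    """Resources identified by URI, accessed through one uniform interface."""

    def __init__(self) -> None:
        self._resources: Dict[str, str] = {}

    def handle(
        self, method: str, uri: str, body: Optional[str] = None
    ) -> Tuple[int, Optional[str]]:
        """Apply `method` to the resource at `uri`; return (status, body)."""
        if method == "GET":
            if uri in self._resources:
                return 200, self._resources[uri]
            return 404, None
        if method == "PUT":
            created = uri not in self._resources
            self._resources[uri] = body or ""
            return (201 if created else 204), None
        if method == "DELETE":
            if self._resources.pop(uri, None) is None:
                return 404, None
            return 204, None
        return 405, None  # method not supported by this sketch
```

Adding a new kind of resource requires no new operations, only new URIs, which is one reason REST-style systems are easy to extend.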

Required Readings

Resources

Lecture Notes

Additional Resources

September 25 : Tuesday

Character Set Issues & Unicode 

Every character-based document is based on some model of which characters are available, and how they are encoded. Unicode is the most popular character set today and provides a variety of encoding schemes, each of them being a Unicode Transformation Format (UTF). In addition to character sets and encodings, other issues relevant when dealing with characters are transcoding and normalization, which deal with the problems arising when using different character encodings or different encodings of particular characters.
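The differences between encodings, transcoding, and normalization can be explored with a few lines of Python. The helper names below are hypothetical; the example character is "é", which can be written either as the single code point U+00E9 or as "e" followed by the combining acute accent U+0301.

```python
import unicodedata
from typing import Dict


def encoded_lengths(text: str) -> Dict[str, int]:
    """Number of bytes `text` occupies in three Unicode encoding forms."""
    return {
        name: len(text.encode(name))
        for name in ("utf-8", "utf-16-be", "utf-32-be")
    }


def transcode(data: bytes, source: str, target: str) -> bytes:
    """Convert bytes from one character encoding to another."""
    return data.decode(source).encode(target)


def same_after_nfc(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)
```

For instance, `transcode(b"\xe9", "iso-8859-1", "utf-8")` yields `b"\xc3\xa9"`, and the two spellings of "é" compare unequal as raw strings but equal after normalization.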

Resources

Lecture Notes

Additional Resources

September 27 : Thursday

Multipurpose Internet Mail Extensions (MIME) Types 

One of the most important aspects of computer-based communications is the concept of media types: the question of what type of information some digital artifact represents, and how it is encoded. The most common standard for this information is the scheme introduced by Multipurpose Internet Mail Extensions (MIME). Media types can be negotiated by peers communicating through HTTP. Some media types allow fragment identifiers, which allow references to a resource to identify a fragment of the complete resource.
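Media type negotiation in HTTP is driven by the Accept request header, in which the client lists media types with optional quality values. The following is a deliberately simplified sketch: the function names are hypothetical, and wildcards such as `*/*` and media type parameters other than `q` are ignored.

```python
from typing import List, Optional, Tuple


def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Parse an Accept header into (media type, q) pairs, best first."""
    preferences = []
    for item in header.split(","):
        fields = [field.strip() for field in item.split(";")]
        media_type = fields[0]
        if not media_type:
            continue
        quality = 1.0  # default when no q parameter is given
        for parameter in fields[1:]:
            name, _, value = parameter.partition("=")
            if name.strip() == "q":
                quality = float(value)
        preferences.append((media_type, quality))
    # sorted() is stable, so equal q values keep their header order
    return sorted(preferences, key=lambda pair: pair[1], reverse=True)


def negotiate(header: str, available: List[str]) -> Optional[str]:
    """Pick the client's most preferred media type the server can offer."""
    for media_type, quality in parse_accept(header):
        if quality > 0 and media_type in available:
            return media_type
    return None
```

With the header `text/html;q=0.8, application/atom+xml`, a server offering both types would choose `application/atom+xml`.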

Required Readings

Required

MIME Respect

Resources

Lecture Notes

Additional Resources

October 2 : Tuesday

Hypertext Markup Language (HTML) 

The Hypertext Markup Language (HTML) is the most important content type on the Web. Even though it is primarily intended for humans (by presenting formatted pages of textual content), it also has facets that are important for machine-based processing. HTML can be used in a variety of ways, and this lecture looks at some of the important rules that should be observed when creating HTML, for example how to use HTML markup in general and how to create accessible forms.
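One rule for accessible forms is that each visible form control should have a label associated with it. As an example of machine-based processing of HTML, the sketch below uses Python's standard HTML parser to list input elements that no `label` element refers to via its `for` attribute. The class and function names are hypothetical, and the check covers only `input` elements and only explicit `for`/`id` association.

```python
from html.parser import HTMLParser
from typing import List, Optional, Set


class LabelChecker(HTMLParser):
    """Collect the ids of input elements and the targets of label elements."""

    def __init__(self) -> None:
        super().__init__()
        self.input_ids: List[Optional[str]] = []
        self.label_targets: Set[Optional[str]] = set()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "input" and attributes.get("type") != "hidden":
            # Inputs without an id are recorded as None and always reported.
            self.input_ids.append(attributes.get("id"))
        elif tag == "label" and "for" in attributes:
            self.label_targets.add(attributes["for"])


def unlabeled_inputs(html: str) -> List[Optional[str]]:
    """Return the ids of visible inputs that no label points to."""
    checker = LabelChecker()
    checker.feed(html)
    checker.close()
    return [i for i in checker.input_ids if i not in checker.label_targets]
```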

Resources

Lecture Notes

Additional Resources

October 4 : Thursday

Cascading Style Sheets (CSS) 

Cascading Style Sheets (CSS) were designed as a language for better separating presentation-specific issues from the structuring of documents as provided by HTML. CSS uses a simple model of selectors and declarations. Selectors specify to which elements of a document a set of declarations (each being a value assigned to a property) apply; in addition there is a model of how property values are inherited and cascaded. The biggest limitation of CSS is that it cannot change the structure of the displayed document.
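The model of selectors, declarations, and cascading can be imitated in a toy form. The sketch below supports only type selectors (such as `p`) and class selectors (such as `.note`), treats class selectors as more specific than type selectors, and lets later rules win among equally specific ones. Inheritance and every other part of real CSS are left out, and all names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Rule = Tuple[str, Dict[str, str]]  # (selector, declarations)


def specificity(selector: str) -> Tuple[int, int]:
    """Class selectors outrank type selectors in this toy model."""
    return (1, 0) if selector.startswith(".") else (0, 1)


def matches(selector: str, tag: str, classes: Set[str]) -> bool:
    """Does the selector apply to an element with this tag and classes?"""
    if selector.startswith("."):
        return selector[1:] in classes
    return selector == tag


def computed_style(rules: List[Rule], tag: str, classes: Set[str]) -> Dict[str, str]:
    """Cascade all matching declarations into one property/value mapping."""
    applicable = [
        (specificity(selector), index, declarations)
        for index, (selector, declarations) in enumerate(rules)
        if matches(selector, tag, classes)
    ]
    style: Dict[str, str] = {}
    # Apply weaker rules first so that stronger and later rules overwrite them.
    for _spec, _index, declarations in sorted(applicable, key=lambda r: (r[0], r[1])):
        style.update(declarations)
    return style
```

Given the rules `.note {color: red}` and `p {color: black; margin: 1em}`, a `p` element with class `note` ends up with `color: red` and `margin: 1em`.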

Required Readings

Resources

Lecture Notes

Additional Resources

October 9 : Tuesday

Usability 

According to the International Organization for Standardization (ISO), usability is the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use. We will discuss tradeoffs in the design of Web interfaces to support users' goals, and present resources to aid design decisions.

Required Readings

Resources

Lecture Notes

Additional Resources

October 11 : Thursday

Accessibility 

Web accessibility refers to the degree to which the Web can be used and accessed by people with disabilities. The World Wide Web Consortium (W3C) states that accessibility specifically involves how people with disabilities can perceive, understand, navigate, interact with, as well as contribute to the Web. Techniques for supporting these needs will be discussed and reviewed through examples and exercises.

Required Readings

Resources

Lecture Notes

Additional Resources

October 16 : Tuesday

Dynamic HTML (DHTML) 

Dynamic HTML (DHTML) refers to the combination of HTML, CSS, the Document Object Model (DOM), and JavaScript. The Document Object Model (DOM) is an API for markup-structured documents; it is used for HTML as well as for XML. The DOM allows program code to access documents for read and write access. DOM-based access to documents in conjunction with user-originated events (keyboard and mouse events) allows scripting code on Web pages to dynamically update documents.
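The DOM is defined independently of any programming language, so read and write access to a document can be sketched outside the browser as well. The example below uses Python's `xml.dom.minidom` as a stand-in for the JavaScript binding found in browsers; the function name and the sample markup are made up for illustration.

```python
from xml.dom.minidom import parseString


def append_item(markup: str, text: str) -> str:
    """Add a new <li> with the given text to the first <ul> in the markup."""
    document = parseString(markup)
    # Read access: locate an existing element in the tree.
    ul = document.getElementsByTagName("ul")[0]
    # Write access: create new nodes and attach them to the tree.
    li = document.createElement("li")
    li.appendChild(document.createTextNode(text))
    ul.appendChild(li)
    return document.documentElement.toxml()
```

In a browser, the same calls (`getElementsByTagName`, `createElement`, `createTextNode`, `appendChild`) would typically run inside an event handler and change the page the user is looking at.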

Resources

Lecture Notes

Additional Resources

October 18 : Thursday

Asynchronous JavaScript and XML (Ajax) 

Asynchronous JavaScript and XML (Ajax) takes DHTML to the next level by allowing server access from within scripting code. This is accomplished by using a standardized API for client/server communications, the XMLHttpRequest object. This object allows using HTTP connections from within scripting code, and thereby allows scripting code to dynamically reload data from a server in response to user interactions.

Resources

Lecture Notes

Additional Resources

October 23 : Tuesday

Internationalization (I18N) & Localization (L10N) 

Many publishing environments need to support multiple languages. In many cases, the requirement to support multiple languages surfaces in later stages of product development or of a publishing solution, which can cause major design changes and drive up costs. Internationalization (I18N) is the practice of designing systems which can adapt to different locales. Localization (L10N) is the activity of identifying, defining, and encoding locales, based on internationalized software.
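The split between the two activities can be sketched with a tiny message catalog: the lookup code is the internationalized part and never changes, while supporting a new locale only means adding catalog entries. The catalog contents, the message key, and the function name are invented for this example.

```python
from typing import Dict

# Localization: one set of messages per locale (hypothetical sample data).
CATALOG: Dict[str, Dict[str, str]] = {
    "en": {"greeting": "Welcome"},
    "de": {"greeting": "Willkommen"},
}


def translate(key: str, locale: str, default_locale: str = "en") -> str:
    """Internationalized lookup: exact locale, then language, then default."""
    for candidate in (locale, locale.split("-")[0], default_locale):
        messages = CATALOG.get(candidate, {})
        if key in messages:
            return messages[key]
    return key  # last resort: show the key itself
```

A request for the locale `de-CH` falls back to the `de` messages, and an unsupported locale such as `fr` falls back to the default.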

Required Readings

Resources

Lecture Notes

Additional Resources

October 25 : Thursday

Picture Formats 

Pictures are the only multimedia content on the Web that is widely supported by standardized formats. The most important picture formats are the Graphics Interchange Format (GIF), the Joint Photographic Experts Group (JPEG) format, and the Portable Network Graphics (PNG) format. These picture formats target different application areas, and depending on the picture material, choosing one format over the other can make a big difference.
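Each of the three formats starts with a characteristic byte sequence, so the format of a picture can be recognized from its first few bytes. The following sketch maps those signatures to media types; the function name is hypothetical, and only the start of the data is inspected.

```python
from typing import Optional

# Leading bytes ("magic numbers") of the three formats.
SIGNATURES = [
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
]


def sniff_image_type(data: bytes) -> Optional[str]:
    """Guess the media type of picture data from its first bytes."""
    for signature, media_type in SIGNATURES:
        if data.startswith(signature):
            return media_type
    return None
```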

Resources

Lecture Notes

Additional Resources

GIF · JPEG · PNG

October 30 : Tuesday

Content Syndication 

For many information sources on the Web, it is useful to have some standardized way of subscribing to information updates. Syndication formats such as RSS and Atom can be used by these information sources to publish a feed of updated information items. While RSS and Atom are read-only formats, the Atom Publishing Protocol (AtomPub) builds on top of Atom and provides a protocol for submitting new items to feeds.
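Reading a feed mostly comes down to parsing XML in the Atom namespace. The sketch below extracts the entry titles from a small Atom document using Python's standard library; the sample feed, its identifiers, and the function name are invented for illustration.

```python
import xml.etree.ElementTree as ET
from typing import List

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# A made-up minimal feed with two entries.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Feed</title>
  <id>urn:example:feed</id>
  <updated>2007-10-30T09:00:00Z</updated>
  <entry>
    <title>First post</title>
    <id>urn:example:entry-1</id>
    <updated>2007-10-29T12:00:00Z</updated>
  </entry>
  <entry>
    <title>Second post</title>
    <id>urn:example:entry-2</id>
    <updated>2007-10-30T09:00:00Z</updated>
  </entry>
</feed>"""


def entry_titles(feed_xml: str) -> List[str]:
    """Return the titles of all entries in an Atom feed document."""
    root = ET.fromstring(feed_xml)
    return [
        entry.findtext("atom:title", default="", namespaces=ATOM_NS)
        for entry in root.findall("atom:entry", ATOM_NS)
    ]
```

A feed reader would fetch such a document periodically and compare entry ids and update timestamps to detect new items.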

Required Readings

Resources

Lecture Notes

Additional Resources

November 1 : Thursday

Semantic Web 

HTML pages are for human users and describe a resource in very general terms (headings, lists, tables, …). For machine-based interaction, it is often necessary to have more information about the application concepts. XML is a popular language for representing application structures, but is targeted at machine-based processing alone. Microformats and more formal approaches such as the Resource Description Framework (RDF), RDF in Attributes (RDFa), and Web Ontology Language (OWL) are often used to describe Web content semantically.

Resources

Lecture Notes

Additional Resources

Microformats · RDFa · FAQ · RDF · OWL

November 6 : Tuesday

Implementation Variants 

Today's landscape of Internet and Web technologies offers a sometimes confusingly wide array of implementation choices. Given some application idea, an implementation can use basic Web technologies or newer Web 2.0 technologies; it can use browser-embedded functionality such as Flash, Java Applets, ActiveX, Silverlight, or Google Gears; or it can be built with Web-oriented application development platforms such as Adobe Integrated Runtime (AIR) or JavaFX.

Resources

November 8 : Thursday

SWOT Analysis 

Starting from a desired objective (such as the successful implementation of a well-designed Web app), it can be very informative to assess factors influencing the pursuit of this objective. One way to do this is to analyze the Strengths, Weaknesses, Opportunities, and Threats (SWOT) of implementation variants, which supports a more structured way of comparing variants and can be a starting point for choosing the best one.

Resources

Last updated in Fall 2007 by dret