Thanks to the World Wide Web and Java, we now have the opportunity to disseminate information in a way that allows the viewer to actively manipulate facts and concepts. This can lead to ubiquitous "smart" documents that allow many people to cooperatively share information. How far can we go in blurring the distinction between documents and programs? More specifically, what is the point at which programs and documents begin to merge seamlessly together? We give some examples of what has happened in this area and where the world may be going.
Ultimately, success in this unification will depend not mainly on technology (which constantly changes) but rather on the fixed properties of people - how we learn, how we communicate, and what sorts of interfaces will help us use our remarkable human powers of intuition and insight, rather than locking us into preexisting modes of thought. For these reasons, it is a good idea to temper long-term goals and wish lists with short-term milestones and experiments.
In this position paper, we lay out some long-term goals for how the paradigms of "on-line interactive program" and "on-line shared interactive document" might be merged. Then we lay out some short-term milestones that seem to be reasonable steps along the way to those goals, given current knowledge. Finally, we show some concrete experiments that we have done to work toward those goals and milestones, and briefly analyze the results of these experiments.
Leveraging. Our target applications for future documents are high-information activities (e.g., business, education, health care, research). These activities require that we make full use of human communicative power, and that we optimally harness the knowledge of individuals who can provide complex information. In order to meet these goals, new information tools must leverage expert knowledge to help non-expert users create powerful program-documents. The effectiveness of this leveraging approach is demonstrated today by point-and-shoot cameras and desktop publishing programs. In each case expert knowledge becomes a part of an information tool, so that it can be used invisibly by a user who may know nothing about the underlying complications - such as focal lengths, ligatures, or real-time responsive animations.
Limitations to Address. The new generation of tools, as well as the program-documents created through them, must address a number of deep problems with the communication tools currently at our disposal. For instance, while humans perceive and create information using multiple sensory and expressive modalities, current tools for creating and disseminating information often make use of only one or two modalities (e.g., only text or speech). At this point, electronic communication tools that closely integrate multiple, simultaneous media forms (e.g. text, speech, animation, and gesture) are the special province of experts, not average citizens. Further, even the most advanced communication forms available lack any ability to adapt to the user, and do not facilitate the active role that cognitive scientists and educators know to be necessary for effective communication and learning. Finally, today's tools are often unavailable when and where they are needed, due to lack of mobility and/or high cost.
All of the examples shown are Java 1.01 applets. This lowest-common-denominator platform was chosen because it allows the broadest available flexibility in making on-line experiments simultaneously available to large numbers of users.
The figures above are from an interactive applet designed to investigate the question "how simple can we make it for people to design customized animated characters?" The "character" on the left actually contains seven interactive zones. Dragging the mouse within any of these zones causes certain regions of the body to change shape so as to follow the mouse movement. Certain parts are coupled together, so that dragging one always causes the other to change as well.
We can view this as a very simple example of program-document leveraging on multiple layers. The base substrate provides the potential for deformable shapes and for deformations to drive other parameters (including other deformations, and perhaps modes of action, or even personality traits). The example body is simple enough to be designed or customized by a user with nearly any level of drawing expertise. Expert-created rules can then be applied, which might be stored in a series of humanoid rule sets (for deformations and correspondences) between which the individual author/user can select and blend.
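One way to picture the coupling described above is as a small expert-authored rule matrix: dragging within one interactive zone changes that zone's deformation parameter, and the rule matrix propagates the change to every coupled parameter. The sketch below is purely illustrative; the class, parameter indices, and coupling values are invented for this example and are not taken from the applet.

```java
// Illustrative sketch of expert-authored coupling rules: dragging one
// deformation parameter automatically drives the parameters coupled
// to it, as with the applet's linked body zones.
public class CoupledDeformation {
    // Deformation parameters, e.g. { bellySize, chestSize, ... }.
    private final double[] params;
    // coupling[i][j] = how strongly a change in parameter j drives parameter i.
    private final double[][] coupling;

    public CoupledDeformation(int n, double[][] coupling) {
        this.params = new double[n];
        this.coupling = coupling;
    }

    // Called when the user drags inside zone j; propagates the change
    // to every coupled parameter through the expert-authored rule matrix.
    public void drag(int j, double delta) {
        for (int i = 0; i < params.length; i++) {
            params[i] += coupling[i][j] * delta;
        }
    }

    public double get(int i) { return params[i]; }
}
```

Blending between humanoid rule sets, as described above, would then amount to interpolating between two such coupling matrices.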
Questions for the workshop:
In the case of animation design, layers of expert-created rules can provide a non-expert user with individual movements to include or overall stylistic guidance. For each character, this work can happen on the level of direct manipulation of the character and limited parameters. The above figure was created using a prototype applet that allows an animator to work on a higher level of "attitude" control, somewhat as though the graphical puppet were a human actor being given motivational instructions.
Yet for more precise or unexpected control, or for the purposes of a content-creating expert, a more detailed representation may be required - one that is both the document explaining the layered actions and the interface through which those actions may be manipulated.
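As a rough illustration of this kind of layering, the sketch below maps a single high-level "attitude" parameter onto blend weights for two hypothetical low-level movements. The action names and weight ramps are invented for the example; they stand in for the expert-authored stylistic rules described above.

```java
// Illustrative sketch of "attitude" control layered over individual
// movements: one high-level parameter re-weights a set of lower-level
// actions. The actions and ramp rules here are hypothetical.
public class AttitudeLayer {
    // Blend weight for each low-level action, given an attitude in [0,1]
    // (0 = timid, 1 = bold). Each rule is an expert-authored ramp.
    public static double slouchWeight(double attitude) {
        return 1.0 - attitude;      // timid characters slouch more
    }
    public static double strideWeight(double attitude) {
        return attitude;            // bold characters stride more
    }
    // Mix two keyframed joint angles by the attitude-derived weights.
    public static double blendAngle(double slouchAngle, double strideAngle,
                                    double attitude) {
        double ws = slouchWeight(attitude), wt = strideWeight(attitude);
        return (ws * slouchAngle + wt * strideAngle) / (ws + wt);
    }
}
```

An animator working at the "attitude" level moves only the one high-level parameter; the layered rules take care of the individual movements.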
Questions for the workshop:
In our improvisational animation work, we have shown how to make an embodied agent react with responsive facial expression, without using repetitive prebuilt animations, and how to mix those facial expressions to simulate shifting moods and attitudes. The result is real-time interactive facial animation with convincing emotive expressiveness.
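The mixing of facial expressions can be sketched as a normalized weighted sum over expression targets, where each target is a vector of facial-muscle activations (loosely in the spirit of FACS action units). This is a minimal illustration only; the three-element vectors and weighting scheme are assumptions, not the actual representation used in our system.

```java
// Minimal sketch of mixing facial expressions to suggest shifting
// moods: each expression is a vector of activation values, and the
// current face is a weight-normalized blend of the active expressions.
public class ExpressionMixer {
    // Blend expression vectors by weight; weights need not sum to 1.
    public static double[] mix(double[][] expressions, double[] weights) {
        int n = expressions[0].length;
        double[] face = new double[n];
        double total = 0;
        for (double w : weights) total += w;
        for (int e = 0; e < expressions.length; e++)
            for (int i = 0; i < n; i++)
                face[i] += weights[e] / total * expressions[e][i];
        return face;
    }
}
```

Animating the weights over time, rather than the activations directly, is what lets the face drift between moods without replaying prebuilt animations.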
The eventual goal of this research is to give computer-mediated documents the ability to represent the subtleties we take for granted in face-to-face communication, so that they can function as agents for an emotional point of view.
In an experiment to discover a viable vocabulary for such a capability, we isolated the minimal set of facial expression elements that produces a "convincing" impression of character and personality. This is, of course, only a subset of the full expressive range of the human face, for which Paul Ekman's pioneering Facial Action Coding System gives a functional description.
This was also an experiment to test whether it was feasible to implement a 3D character in Java without using any 3D plug-ins (i.e., doing the 3D rendering entirely in the Java applet itself), which would allow us to conduct widely-distributed user studies using web technology already in place.
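The core of such plug-in-free rendering can be suggested by a few lines of plain Java: rotate a model-space point, then project it to the screen with a perspective divide. The focal length and screen constants below are illustrative, and this sketch omits everything (shading, polygon filling, hidden surfaces) that a real character renderer would need.

```java
// Sketch of a minimal software 3D step, of the kind that makes
// plug-in-free rendering possible inside a plain Java applet.
public class SoftwareProject {
    // Rotate (x,y,z) about the y axis by `angle` radians, then project
    // onto a width-by-height screen with focal length f.
    public static double[] project(double x, double y, double z,
                                   double angle, double f,
                                   int width, int height) {
        double c = Math.cos(angle), s = Math.sin(angle);
        double rx = c * x + s * z;           // rotated x
        double rz = -s * x + c * z;          // rotated z (depth)
        double depth = f + rz;               // camera sits f units back
        double sx = width  / 2.0 + f * rx / depth;
        double sy = height / 2.0 - f * y  / depth;
        return new double[] { sx, sy };
    }
}
```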
Questions for the workshop:
Documents are typically divided into nested semantic levels. Summary information in an introduction or section heading expands into detailed information in a later section. The same is generally true of large interaction tasks. We ask how to create interactive control of information that makes the best use of people's intuitive and cultural notions of this nesting.
The above sequence of images shows a number of time-sequential snapshots from a user interaction with a zoomable multi-scale user interface for controlling and editing animation. The user zooms into the controls for a particular animated character, and chooses an animatable to edit. The user can then build nested expressions of animation sliders and key-frame curves, in order to provide various controls for the animation.
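The coordinate transform at the heart of such a zoomable interface can be sketched simply: each nested group carries a (scale, offset) frame, and zooming about a point composes a new frame that keeps the point under the cursor fixed on screen. The class and field names below are illustrative, not taken from our implementation.

```java
// Sketch of the core transform behind a zoomable multi-scale interface.
public class ZoomFrame {
    double scale = 1, ox = 0, oy = 0;   // screen = world * scale + offset

    public double[] toScreen(double wx, double wy) {
        return new double[] { wx * scale + ox, wy * scale + oy };
    }

    // Zoom by `factor` about the screen point (px, py), so that the
    // world point under the cursor stays put on screen.
    public void zoomAbout(double factor, double px, double py) {
        ox = px + (ox - px) * factor;
        oy = py + (oy - py) * factor;
        scale *= factor;
    }
}
```

Nesting comes from composing frames: a control group drawn inside another group multiplies its frame onto its parent's, so detail appears only once the user has zoomed deep enough for it to be legible.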
Questions for the workshop:
The above images are a set of progressively rendered versions of a synthetic planet created entirely with procedural texture synthesis algorithms. An entirely procedural planet generator allows us to let users "steer" the evolution or appearance of a planet using high-level controls that directly affect the aesthetic parameters of planetary evolution and appearance.
For example, the user can continuously vary the ratio of ocean area to land mass area. By giving the user a number of such controls, we can begin to conduct experiments on user experience of higher level control of procedurally and aesthetically defined artifacts in a highly shared computer-mediated document.
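One plausible mechanism for the ocean-to-land control is to generate a procedural heightfield and then place the sea level at the matching quantile of the heights, so that exactly the requested fraction of the surface lies underwater. The sketch below is an assumption about how such a control could work, not our actual generator; its hash-based "noise" is a stand-in for real procedural texture synthesis.

```java
import java.util.Arrays;

// Sketch of direct user control over a planet's ocean-to-land ratio:
// pick the sea level as a quantile of a procedural heightfield.
public class OceanControl {
    // Cheap deterministic pseudo-height in [0,1) for sample i,
    // standing in for a real procedural noise function.
    public static double height(int i) {
        int h = i * 0x9E3779B1;
        h ^= h >>> 16;
        return (h & 0xFFFF) / 65536.0;
    }

    // Return the sea level that puts `oceanFraction` of n surface
    // samples below water.
    public static double seaLevel(int n, double oceanFraction) {
        double[] hs = new double[n];
        for (int i = 0; i < n; i++) hs[i] = height(i);
        Arrays.sort(hs);
        int k = (int) (oceanFraction * n);
        return hs[Math.min(k, n - 1)];
    }
}
```

Sliding the ocean-fraction control then continuously re-derives the sea level, giving the user an aesthetic parameter rather than a numeric threshold.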
Questions for the workshop:
A summary of our general thesis might be stated as follows: (i) the layering approach is effective; (ii) blurring the distinction between programs and documents is a worthwhile goal; (iii) this blending, when done on the web, is effective for collaboration and information sharing; and (iv) breakthroughs will come from making accessible tools that can be used by creative people in unexpected ways.
N. Badler, B. Barsky, D. Zeltzer, Making Them Move: Mechanics, Control, and Animation of Articulated Figures, Morgan Kaufmann Publishers, San Mateo, CA, 1991.
N. Badler, C. Phillips, B. Webber, Simulating Humans: Computer Graphics, Animation, and Control, Oxford University Press, 1993.
J. Bates, A. Loyall, W. Reilly, Integrating Reactivity, Goals and Emotions in a Broad Agent, Proceedings of the 14th Annual Conference of the Cognitive Science Society, Indiana, July 1992.
B. Bederson, J. Hollan, K. Perlin, J. Meyer, D. Bacon, and G. Furnas, Pad++: A Zoomable Graphical Sketchpad for Exploring Alternate Interface Physics, Journal of Visual Languages and Computing (7), 3-31. 1996.
B. Blumberg, T. Galyean, Multi-Level Direction of Autonomous Creatures for Real-Time Virtual Environments, Computer Graphics (SIGGRAPH '95 Proceedings), 30(3):47--54, 1995.
A. Bruderlin, L. Williams, Motion Signal Processing, Computer Graphics (SIGGRAPH '95 Proceedings), 30(3):97--104, 1995.
R. Brooks, A Robust Layered Control System for a Mobile Robot, IEEE Journal of Robotics and Automation, 2(1):14--23, 1986.
J. Chadwick, D. Haumann, R. Parent, Layered construction for deformable animated characters, Computer Graphics (SIGGRAPH '89 Proceedings), 23(3):243--252, 1989.
D. Ebert et al., Texturing and Modeling: A Procedural Approach, Academic Press, London, 1994.
P. Ekman, Facial expression and emotion, American Psychologist, 48, 384-392, 1993.
M. Girard, A. Maciejewski, Computational modeling for the computer animation of legged figures, Computer Graphics (SIGGRAPH '85 Proceedings), 19(3):263--270, 1985.
B. Hayes-Roth and R. van Gent, Improvisational puppets, actors, and avatars, in Proceedings of the Computer Game Developers' Conference, Santa Clara, CA, 1996.
J. Hodgins, W. Wooten, D. Brogan, J. O'Brien, Animating Human Athletics, Computer Graphics (SIGGRAPH '95 Proceedings), 30(3):71--78, 1995.
M. Johnson, WavesWorld: A Testbed for Three Dimensional Semi-Autonomous Animated Characters, PhD Thesis, MIT, 1994.
M. Karaul, personal communication
P. Maes, T. Darrell, and B. Blumberg, The Alive System: Full Body Interaction with Autonomous Agents, Computer Animation '95 Conference, Switzerland, April 1995, IEEE Press, pages 11-18.
B. Mandelbrot, The Fractal Geometry of Nature, W. H. Freeman and Co., 1983.
M. Minsky, Society of Mind, MIT Press, 1986.
C. Morawetz, T. Calvert, Goal-directed human animation of multiple movements, Proc. Graphics Interface, pages 60--67, 1990.
S. Mukherjea, J. Foley, and S. Hudson, Visualizing Complex Hypermedia Networks through Multiple Hierarchical Views, Proceedings of CHI'95, ACM press, 331-337. 1995.
F. K. Musgrave, Methods for Realistic Landscape Imaging, Ph.D. Thesis, Dept. of Computer Science, Yale University, 1994.
K. Perlin, An image synthesizer, Computer Graphics (SIGGRAPH '85 Proceedings), 19(3):287--293, 1985.
K. Perlin, Danse interactif, SIGGRAPH '94 Electronic Theatre, Orlando, 1994.
K. Perlin, Real Time Responsive Animation with Personality, IEEE Transactions on Visualization and Computer Graphics, 1(1), 1995.
K. Perlin, A. Goldberg, The Improv System, Technical Report, NYU Department of Computer Science, 1996.
(online at http://www.mrl.nyu.edu/improv)
K. Perlin, E. Hoffert, Hypertexture, Computer Graphics (SIGGRAPH '89 Proceedings), 23(3), 1989.
K. Perlin, A. Goldberg, Sid and the Penguins, SIGGRAPH '98 Electronic Theatre, Orlando. 1998.
K. Perlin, Layered Compositing of Facial Expression, SIGGRAPH '97 Technical Sketch, Los Angeles. 1997.
K. Perlin, A. Goldberg, Improv: A System for Scripting Interactive Actors in Virtual Worlds, Computer Graphics, Vol. 29, No. 3, 1996.
K. Perlin and D. Fox, Pad: An Alternative Approach to the Computer Interface, Proceedings of SIGGRAPH '93, ACM Press, 57-64, 1993.
K. Sims, Evolving virtual creatures, Computer Graphics (SIGGRAPH '94 Proceedings), 28(3):15--22, 1994.
N. Stephenson, Snow Crash, Bantam Doubleday, New York, 1992.
S. Strassman, Desktop Theater: Automatic Generation of Expressive Animation, PhD thesis, MIT Media Lab, June 1991.
(online at http://www.method.com/straz/straz-phd.pdf)
D. Terzopoulos, X. Tu, and R. Grzeszczuk, Artificial Fishes: Autonomous Locomotion, Perception, Behavior, and Learning in a Simulated Physical World, Artificial Life, 1(4):327-351, 1994.
R. F. Voss, "Fractal Forgeries," in Fundamental Algorithms for Computer Graphics, R. A. Earnshaw, ed., Springer-Verlag, 1985.
A. Witkin, Z. Popovic, Motion Warping, Computer Graphics (SIGGRAPH '95 Proceedings), 30(3):105-108, 1995.