Projects

Saved by Jim Davies
on June 5, 2013 at 8:27:57 am
 
Visual Imagination Modeling
--------------------------
When I say I walked my dog this morning, you can picture what that looks like, even though I have told you nothing about where it was, what kind of dog I have, etc. People bring lots of knowledge to bear on their visual imaginations. How is this knowledge used to create imaginings?
This project's long-term goal is to create a computer-program model of human imagination.
In the short term, I am using the results of psychological experiments and data from image information to create a demo. This demo should be able to take simple inputs (e.g., a cat and a house) and create a visual image that contains those elements, arranged in a way that has some psychological plausibility. Visuo, Detectors, and the Image Oracle are sub-projects.
Detectors:
We have built several programs that report on spatial characteristics and relationships in Peekaboom data (e.g., above, occluding). Each detector outputs a fuzzy belief value between 0 and 1, representing how true the agent believes the relationship to be in the input image. There are several directions we are going with this work:
     Project: Do psychological studies that compare them to human data.
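
As a rough illustration of what a detector's output looks like, here is a minimal sketch of an "above" detector. The scoring rule (the fraction of pixel pairs in which the first object's pixel is higher in the image) and the name `above_belief` are assumptions made for this example; they are not the lab's actual detectors.

```python
from typing import Iterable, Tuple

Pixel = Tuple[int, int]  # (x, y), with y increasing downward as in image coordinates


def above_belief(a: Iterable[Pixel], b: Iterable[Pixel]) -> float:
    """Return a fuzzy belief in [0, 1] that region a is above region b.

    Hypothetical scoring rule: the fraction of (pixel in a, pixel in b)
    pairs in which the pixel from a has the smaller y coordinate.
    """
    a_ys = [y for _, y in a]
    b_ys = [y for _, y in b]
    if not a_ys or not b_ys:
        raise ValueError("both regions must contain at least one pixel")
    higher = sum(1 for ya in a_ys for yb in b_ys if ya < yb)
    return higher / (len(a_ys) * len(b_ys))
```

A region that lies entirely over another scores 1, the reverse scores 0, and vertically overlapping regions fall in between, which is the kind of graded value that could be compared against human judgments.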
     
Visuo:
Visuo is a cognitive model of estimation of quantitative magnitudes (e.g., height, size). It takes in data in the training phase. In the visualization phase it uses analogy to estimate magnitudes of things it has never experienced before. Some papers have been published on this work already.
Imagination Engine
    Integration with Image Oracle:
          The image oracle outputs distance and angle data that Visuo will take as input. (Sterling, Cesar, Jonathan)
    Integration with Detectors:
          Waiting on Quanty data, or the psychological tests of detectors. Then we can relate people's judgments of relationships to angle and distance and to our detectors. 
     Natural Language Interface:
           Trying to replicate the work of WordsEye, so that complex paragraphs can be turned into Visuo input (David Dodds)
     Part/Whole (Kae Bagg) 
          When the Engine puts in a picture of a woman, it might use the oracle to know that the nose label is also there. It would be a mistake, however, to put in another image, this one of a nose, because the image of the woman probably already has a nose. This is because nose and woman have a part/whole relationship. Get the engine to be smart about this by understanding part/whole relationships.
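
One simple way to picture the part/whole check is a lookup table from wholes to their parts. The table contents and the function name `drop_redundant_parts` below are hypothetical and only illustrate the idea; a real version would need the part/whole knowledge to come from somewhere.

```python
# Hypothetical part/whole table: each whole maps to the parts that an image
# of that whole is assumed to contain already.
PARTS_OF = {
    "woman": {"nose", "eye", "hand"},
    "house": {"door", "window"},
}


def drop_redundant_parts(labels, parts_of=PARTS_OF):
    """Remove labels that are parts of another requested label, keeping order."""
    requested = set(labels)
    covered = set()
    for whole in requested:
        covered |= parts_of.get(whole, set())
    return [label for label in labels if label not in covered]
```

With this table, asking for a woman, a nose, and a dog would leave only the woman and the dog, while asking for a nose on its own would still produce a nose.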
     Coherence:
          Right now the oracle simply takes the top N labels that co-occur with the input. For a dog, though, that might give it both a leash and a sofa. We know that these two labels come from different environments and are unlikely to be seen together. Leash and sofa might each co-occur with dog, but they probably don't co-occur with each other. Fix this.
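
One possible approach, sketched here under assumptions, is a greedy filter: walk down the ranked labels and keep a label only if it also co-occurs often enough with every label already kept. The function name, the `threshold` parameter, and the dictionary of pairwise probabilities are all hypothetical.

```python
def coherent_labels(ranked, pair_cooccurrence, k=3, threshold=0.1):
    """Greedily pick up to k labels that also co-occur with each other.

    ranked: candidate labels, best first (e.g. by co-occurrence with "dog").
    pair_cooccurrence: dict mapping frozenset({a, b}) to a co-occurrence
        probability; missing pairs are treated as 0.0.
    """
    chosen = []
    for label in ranked:
        compatible = all(
            pair_cooccurrence.get(frozenset((label, other)), 0.0) >= threshold
            for other in chosen
        )
        if compatible:
            chosen.append(label)
        if len(chosen) == k:
            break
    return chosen
```

Because the filter is greedy, the highest-ranked label always wins a conflict; a more careful version might instead search for the most mutually coherent set.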
     Contact vs. Average pixel: 
          Right now the engine uses the average pixel location to determine the location of an object. It does not, for example, take into account how big the object is (in pixels). As a result, when we calculate the "distance" between two objects, it might look as though they are far apart when in reality their pixels are overlapping. Give the engine a sense of how big the objects are so this doesn't happen.
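
The difference can be seen in a small sketch that compares the distance between average pixel locations with the gap between bounding boxes. Bounding boxes are only one assumed way to represent object size, and the function names here are made up for the example.

```python
import math


def centroid(pixels):
    """Average (x, y) location of a list of pixels."""
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    return sum(xs) / len(xs), sum(ys) / len(ys)


def centroid_distance(a, b):
    """Distance between the average pixel locations of two objects."""
    (ax, ay), (bx, by) = centroid(a), centroid(b)
    return math.hypot(ax - bx, ay - by)


def bounding_box(pixels):
    """Return (min_x, min_y, max_x, max_y) for a list of pixels."""
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    return min(xs), min(ys), max(xs), max(ys)


def box_gap(a, b):
    """Shortest distance between two bounding boxes; 0 if they touch or overlap."""
    ax0, ay0, ax1, ay1 = bounding_box(a)
    bx0, by0, bx1, by1 = bounding_box(b)
    dx = max(bx0 - ax1, ax0 - bx1, 0)
    dy = max(by0 - ay1, ay0 - by1, 0)
    return math.hypot(dx, dy)
```

For two wide objects that touch at one edge, `centroid_distance` can be large while `box_gap` is 0, which is the case the engine currently gets wrong.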
     Stitching:
          There is a technique in graphics known as photo-stitching. One can place two bits of photo onto a canvas, and there are algorithms that interpolate the pixels in between so that the final product looks like it's from one photo. Here is one example: http://mi.eng.cam.ac.uk/research/projects/Query/   and another at http://cg.cs.tsinghua.edu.cn/montage/main.htm
          Get this working with the engine so that our collages don't look so bad. 
     Mine 3D environments for spatial relationships:
          We need a database of how objects tend to relate to each other in 3D. The Quanty game is one way to get at this, but another (thought of by Sterling Somers) is to mine 3D environments that have already been created. Assuming they are realistic, we can get information such as co-occurrence and angle/distance from them, just as we did in 2D from LabelMe and Peekaboom. Once we have this, we will have data to use to create new 3D environments. We already have some candidate programs that will mine this data from web3d. What we need is for someone to find or create a database of 3D environments so we can run the program on all of them and get a lot of information.
     Clean Up LabelMe:
          The LabelMe database contains mistakes and other problems. I think we should crowdsource fixing it, perhaps using Amazon's Mechanical Turk.
     Optimize/reprogram Visuo and run on all of the terms in LabelMe (Jonathan?):
Image Oracle: 
The Image Oracle has two functions:
     1) It takes some words as input and returns the other labels most likely to appear in the same images, along with their probabilities.
     2) It takes some words a and another word b, and returns how likely b is to appear in the same image as a.
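
The two functions above can be sketched as simple counts over a collection of labeled images. This is a minimal illustration that assumes each image is represented as a set of labels; the function names and the data layout are hypothetical, not the oracle's actual interface.

```python
from collections import Counter


def top_cooccurring(images, query, n=3):
    """Function 1: the top n other labels found in images containing every query word.

    Returns (label, probability) pairs, where probability is the share of
    matching images that also contain the label.
    """
    query = set(query)
    matching = [set(labels) for labels in images if query <= set(labels)]
    if not matching:
        return []
    counts = Counter(label for labels in matching for label in labels - query)
    return [(label, count / len(matching)) for label, count in counts.most_common(n)]


def cooccurrence_probability(images, a, b):
    """Function 2: the share of images containing all words in a that also contain b."""
    a = set(a)
    matching = [set(labels) for labels in images if a <= set(labels)]
    if not matching:
        return 0.0
    return sum(1 for labels in matching if b in labels) / len(matching)
```

Both functions estimate a conditional probability from raw counts, so labels that appear in very few images will give noisy estimates.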
     Helping Computer Vision:
            The oracle will help object recognition systems make better guesses based on the context of other things in the image. (Cesar)
 
The Lake Cognitive Architecture
-------------------------------
Cognitive models are computer programs made to imitate how people think. Many are special-purpose, for one cognitive task. My dissertation was a cognitive model. A cognitive architecture, in contrast, is a piece of software that you use to create models. So it's got a memory system, attention, etc., that 1) help you create specific models with it, and 2) constrain the model you create so it's forced to be realistic. There are only about 15 in the world.
Colleague Robert West runs the PythonACT-R system: http://sites.google.com/site/pythonactr/
 
 

 
