Mike Burr - log

[comp] Painting reality with the reality painter

I briefly got hung up on a "good name" which is beyond stupid but also beyond my ability to control sometimes. ("hamburger...earmuffs!")

Let me describe my terrific idea instead.

You have a gizmo. It consists of a video camera, a tiny video projector and a bunch of embedded smarts.

The camera and projector are pointing in the same direction and mounted such that they will keep pointing in the same direction. That is, they are "potted" or bolted to steel or something. You shouldn't be able to jostle the device and get the cam and projector pointing in different directions. I mention this because it's important and the system will count on this.

If you are the type that needs something concrete to picture in your mind, let's say that it's a cylinder that is 10cm in diameter and 25cm long. Like the size of a really big flashlight. A Maglite.

On one end of this cylinder are two small holes: one for the camera and one for the projector.

Remember the first time you picked up a Project-o-Stick??

Here is your experience as a user. You've got it loaded up with some kind of spherical image. To keep it simple, let's just say that it's a spherical image of an interesting pattern: colorful polka dots. Maybe something that would be about right for a toddler's wallpaper. The specifics of the image aren't important, but this is simple and easy to communicate: a spherical digital image of polka dots; nice and round people-pleasing polka dots...on white, say. Like Wonder Bread!

You push the button and are amazed to see the polka dot pattern projected onto the wall "perfectly", that is, all the curves and edges and reflectivity and color and albedo and whatever else are corrected for perfectly by this thing, somehow. What you see, more or less, is a great sphere of polka dots that is invisible until you shine the magic flashlight, at which point you can "see" the great polka dot sphere wherever you point the device. Even in the corners, even where the mantel is supposed to be, even where the wall is painted different colors, even where Aunt Gladys' portrait is supposed to be...it's a perfect polka dot sphere and you are standing at its center.

"Remarkable..."

"How does it work?", you may ask. Well, not having to solder or program (or think through) anything, I happen to have one in my head that I can disassemble and show you its features.

The general idea

Between the frames, or in place of frames, a pattern is displayed instead of the image. It's just a regular (in the uniform, repeating sense) pattern. A good catch-all might be a beehive: just fine lines tracing out connected hexagons, aka a "beehive pattern". But importantly, it just displays lines, as fine and clear as we can manage; just one or two bright pixels wide.

The camera will see this beehive pattern projected onto something, say the walls of your living room. The system "knows" that it's projecting a beehive. It is tasked with calculating what changes would need to be made to the pattern to straighten out all the lines and to have the camera see the beehive pattern, corrected for all the bumps and angles.

Having done that, it should be able to construct a general model for how to distort any image for that given environment (position and direction of device...the wall in question, etc.). So now, when the next frame comes, it can apply that model (or filter) to the image, resulting in a "flat" image where all the bumps and corners are made mostly invisible from the perspective of the camera (the human eyeball is assumed to be sufficiently near the camera).
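To make that slightly more concrete, here is a minimal sketch of the geometric half, under a big simplifying assumption: the surface is flat enough that a single homography captures the projector-to-camera mapping, and the beehive vertices have already been matched up between the pattern we sent and the frame the camera saw. The function and variable names are mine; only OpenCV's findHomography and warpPerspective are real.

```python
import cv2
import numpy as np

def estimate_camera_to_projector(pattern_pts, observed_pts):
    """pattern_pts: Nx2 beehive vertices in projector pixels (what we sent).
    observed_pts: the same vertices in camera pixels (what the camera saw).
    Returns the homography mapping camera coordinates to projector coordinates."""
    H, _ = cv2.findHomography(np.asarray(observed_pts, dtype=np.float32),
                              np.asarray(pattern_pts, dtype=np.float32),
                              cv2.RANSAC)
    return H

def pre_distort(desired_view, H_cam_to_proj, projector_size):
    """desired_view: the image we want the camera (and the nearby eyeball) to see,
    defined in camera pixel coordinates. Warping it by the camera-to-projector
    homography gives the frame the projector should actually emit."""
    w, h = projector_size
    return cv2.warpPerspective(desired_view, H_cam_to_proj, (w, h))
```

A real room would need a dense per-pixel remap rather than one homography, but the shape of the trick is the same: measure how the room moves the pattern, then move the image the opposite way before projecting.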

"So that's how it works, right? It does that every frame! Pretty neat!", you are now saying. There is more! Wait!

This is almost right. This is the basic idea, but it can be improved upon.

First of all: colors. Any given thing will reflect R, G, and B at different levels. We do the above but using all three of those colors. There should be some compromise between having three beehive patterns projected in different colors (quickly!) one after the other, and having (for example) a random choice of colors for the edges of the beehive. With the latter, we could just project one pattern, but we'd lose some resolution with regard to the color data.

As we now have ML technologies, the system could vary among a few strategies like this to "optimize"; say for example it realizes after projecting just a few patterns that everything's the same color!

In the simple, crude version, where we have hardware of infinite ability, and again for the sake of simplification, let's go ahead and project three beehive images in R, G, and B; as noted, we can improve upon this.
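For the photometric half, here is a rough sketch of what those three single-channel captures could buy us, assuming the geometric correction is already handled and the frames are registered; every name here is hypothetical.

```python
import numpy as np

def channel_gain_maps(captures, projected_level=1.0, eps=1e-3):
    """captures: {'r': img, 'g': img, 'b': img}, each an HxW float image (0..1)
    of what the camera saw while that channel's pattern was lit at projected_level.
    The ratio is a per-pixel estimate of how strongly the surface returns that
    channel (wall paint, Aunt Gladys' portrait, and all)."""
    return {c: np.clip(img / projected_level, eps, None) for c, img in captures.items()}

def compensate(image_rgb, gains):
    """image_rgb: HxWx3 float image in 0..1. Boost each channel where the surface
    reflects it weakly, within the projector's 0..1 range (a dark red wall can
    never be fully corrected)."""
    out = np.empty_like(image_rgb, dtype=float)
    for i, c in enumerate("rgb"):
        out[..., i] = np.clip(image_rgb[..., i] / gains[c], 0.0, 1.0)
    return out
```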

We also might want to probe different intensities. Some objects might behave differently under "dim green" vs. "bright green" (the same hue of green; I am talking about luminosity).

There are probably lots of tweaks to be made, even outside of color and luminosity; further reckoning required.

Inevitable pattern flashing artifacts

It's important to remember that we have to balance the brightness and duration of these patterns: if they are too bright, the viewer will see them, or worse, notice them. If they are too dim, we will have to worry more about noise and the sensitivity of the camera.

Thankfully we have theoretical hardware that performs just this side of the limits of physical reality, so what we can do is flash the pattern as bright as we want, but for a mere quantum of time. Picoseconds! Femtoseconds! ...whatever is required. I think the human brain (if not the eyeball) works more on averages and areas under the curve, so this bright, brief flashing of a pattern (hopefully, within some limits) can be perceived as dim (less luminous).
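Back-of-envelope, assuming perception really does integrate over time (a Talbot-Plateau-style averaging assumption, not a measured fact): a very short, very bright flash contributes only its duty cycle's worth of average brightness. The numbers below are made up.

```python
frame_rate_hz = 60.0
flash_duration_s = 0.0005          # a 0.5 ms calibration flash per frame (invented number)
flash_luminance = 1.0              # projector at full blast during the flash
duty_cycle = flash_duration_s * frame_rate_hz         # fraction of the time the flash is on
perceived_extra = duty_cycle * flash_luminance        # ~0.03, i.e. roughly 3% extra brightness
print(f"duty cycle {duty_cycle:.1%}, perceived extra luminance {perceived_extra:.3f}")
```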

And bonus: if you are roughly using RGB in equal quantities, whatever is perceived will be white (ish).

Might also cause seizures. Dunno.

The camera will need to know exactly when to expect this, or at least: the camera will need to be able to perceive this pattern. The pattern might be distinct enough (or maybe have some glaring cues built in) for the camera to just look through its frames and find it. It could have a time stamp or frame number built in using some lite steganography.
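One way the "frame number built in" part could look, purely as illustration (the marker geometry and bit count are made up): a short row of bright or dark squares in one corner of the calibration pattern encodes a counter the camera can read back.

```python
import numpy as np

BITS, CELL = 8, 6   # an 8-bit counter stored in 6x6-pixel cells

def stamp_frame_id(pattern, frame_id):
    """Write frame_id (0..255) as a row of cells along the top-left edge."""
    tagged = pattern.copy()
    for bit in range(BITS):
        value = 255 if (frame_id >> bit) & 1 else 0
        tagged[0:CELL, bit * CELL:(bit + 1) * CELL] = value
    return tagged

def read_frame_id(camera_view):
    """Recover the counter by thresholding the same cells (assumes that corner of
    the projection has already been located and roughly rectified)."""
    frame_id = 0
    for bit in range(BITS):
        cell = camera_view[0:CELL, bit * CELL:(bit + 1) * CELL]
        if cell.mean() > 127:
            frame_id |= 1 << bit
    return frame_id
```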

This sounds like, and probably is, a lot of effort to get working, but it seems within the capabilities of OTS hardware. A 30th or even a 60th of a second is a long time for an MCU.

If you're lazy or scared, there are plenty of in-between baby steps and even the possibility of having v1 "flicker annoyingly" a bit.

How the simplest (boring) version might work

Mount the device to something sturdy. We won't assume the handheld model right off the bat. Point it at something that isn't too challenging, maybe the corner of a room where only two walls are visible (nice, smooth, satin finish walls).

Use a simple straight horizontal line as the pattern. Since we're pointing at the corner of the room, the horizontal line needs to be above or below the level of the camera, otherwise we'll just have a straight line and no correction to do.

The image to be projected (after correction) can also just be a "straight line" in the same place, orientation, etc. And the "frames" here can be minutes apart if you want; we're only getting started.

  1. Project an "actual" straight line
  2. Perceive the crooked line as it crosses corners and bumps
  3. Calculate furiously
  4. Try to project a "straight" line, as viewed by the camera

One could work up from there.
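Those four steps, sketched as code for the single horizontal line, assuming the projector and camera share a resolution and the line is bright enough to find with a per-column argmax (all names hypothetical):

```python
import numpy as np

def make_line_pattern(width, height, row):
    """Step 1: an honest straight line at the requested row."""
    pattern = np.zeros((height, width), dtype=np.uint8)
    pattern[row, :] = 255
    return pattern

def measure_offsets(camera_frame, target_row):
    """Steps 2-3: find where the line actually landed in each column,
    and how far that is from where we wanted it."""
    observed_rows = camera_frame.argmax(axis=0)   # brightest row per column
    return observed_rows - target_row             # positive means the wall pushed it down

def corrected_line_pattern(width, height, target_row, offsets):
    """Step 4: pre-bend the projected line the opposite way so the camera sees it straight."""
    pattern = np.zeros((height, width), dtype=np.uint8)
    rows = np.clip(target_row - offsets, 0, height - 1)
    pattern[rows, np.arange(width)] = 255
    return pattern
```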

Oops

I might have lost track of something important: The only reasonable way this dumb system can straighten out these lines is by progressive refinement. This is not something that can be (easily) calculated if only using one camera (hmmm.)

What I had in mind was an algorithm that can be used to progressively straighten the lines. Start with trial and error. Guide with good design. Hasten with fancy algorithms. Maybe you can get down to just, oh, five frames. That's still a lot and thinking needs to go on in between those frames.
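As a toy stand-in for "calculate furiously over a few frames", here is a self-contained simulation where the room is an unknown monotone warp and we just nudge the projected row by the residual each frame. The numbers are invented, but it settles in about five frames, which is the flavor of convergence I have in mind.

```python
import numpy as np

def wall_warp(projected_rows):
    """Pretend physics: the unknown way the room bends a projected row."""
    return projected_rows * 1.15 + 12.0 + 4.0 * np.sin(projected_rows / 40.0)

target = np.full(640, 240.0)    # where we want the camera to see the line, per column
projected = target.copy()       # first guess: just project it where we want it

for frame in range(5):
    observed = wall_warp(projected)      # what the camera reports back
    residual = observed - target         # how wrong we still are
    projected -= 0.8 * residual          # under-relaxed nudge toward straight
    print(f"frame {frame}: worst error {np.abs(residual).max():.2f} rows")
```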

Assuming that the "scene" doesn't change is probably something that will prevent some madness and focus the mind a bit. The "dynamic" variant is still possible, but if it spent 5 minutes measuring the room and then did a great job of perfectly painting on an image, few would complain.

Extra Credit Fancy Stuff

And you were there!

If it's to be permanently mounted somehow, the device could be carried around a room before it's mounted so that it can perceive its environment (using the pattern, no image to project yet) and keep that data handy for future use.

So long as nothing changes in the room, all of the "correction" data can be calculated for a range of points throughout the room. When a person walks into the room, machine learning mumbo jumbo can be used to identify the position of that person's eyeballs in space. You can then ask the device to display the spherical image corrected for the viewer as he walks around and admires.
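A sketch of that "precompute corrections for many viewpoints, then serve the nearest one" idea; everything here is hypothetical, and the stored correction could be the homography or remap from earlier.

```python
import numpy as np

class ViewpointBank:
    """Correction models surveyed during the walk-around, keyed by 3-D position."""

    def __init__(self):
        self.points = []        # surveyed positions in the room
        self.corrections = []   # one precomputed correction model per position

    def add(self, position_xyz, correction):
        self.points.append(np.asarray(position_xyz, dtype=float))
        self.corrections.append(correction)

    def correction_for(self, eye_xyz):
        """Nearest surveyed viewpoint wins; interpolating between neighbors
        would be the fancier follow-up."""
        eye = np.asarray(eye_xyz, dtype=float)
        distances = [np.linalg.norm(eye - p) for p in self.points]
        return self.corrections[int(np.argmin(distances))]
```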

Edison is a bitch

If the device can be made into a hexagonal section of a sphere, several of them can then be teamed together to form a "ball" (projector + camera facing out), covering all directions and therefore potentially covering the whole room with the spherical image.

If the ball gets small enough and cheap enough, it can SCREW INTO A LIGHT FIXTURE and be actuated by a light switch! (hm... this gives me an idea 💡)

Heat Vision

It occurs to me that the right calibration pattern might just be: a bunch of random pixels. That's got a few nice plusses.

But maybe a problem with random pixels: How do you know if this green pixel is different from the one a few inches away? Which is which? Perhaps not pixels, but maybe random circles of random color? Or maybe distinct shapes instead of circles: otherwise we'd still possibly have the problem of which-is-which.
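One cheap dodge around which-is-which, offered as an assumption rather than a worked-out design: scatter the dots on a jittered grid so each grid cell owns exactly one dot, and let the cell index (plus a palette color as a cross-check) be the dot's identity.

```python
import numpy as np

rng = np.random.default_rng(0)

PALETTE = np.array([
    [255, 0, 0], [0, 255, 0], [0, 0, 255],
    [255, 255, 0], [255, 0, 255], [0, 255, 255],
], dtype=np.uint8)

def coded_dot_pattern(width, height, cell=40, radius=4):
    """Each cell of a coarse grid gets one randomly placed dot, so a dot's identity
    is simply its cell index; the palette color is a sanity check when matching
    camera dots back to projector dots."""
    pattern = np.zeros((height, width, 3), dtype=np.uint8)
    dots = []
    for gy in range(height // cell):
        for gx in range(width // cell):
            x = gx * cell + int(rng.integers(radius, cell - radius))
            y = gy * cell + int(rng.integers(radius, cell - radius))
            color = PALETTE[(gy * (width // cell) + gx) % len(PALETTE)]
            pattern[y - radius:y + radius, x - radius:x + radius] = color
            dots.append(((gx, gy), (x, y)))
    return pattern, dots
```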