[Dennis] is on YouTube with his channel “Made By Dennis,” but for the record he is a maker, not a V-tuber. On the other hand, his latest project, a professional-level tracking rig built with DIY IR cameras and a whole lot of moxie, does mean he’s now equipped to make the move to the prestigious, high-status world of pretending to be an anime girl.
That is of course not why he did it. Like most projects around here, the motivation was more a case of “I wonder if I can…”: in this case, [Dennis] wondered what it would take for him to pull off the same sort of optical motion capture, or MoCap, that is used in Hollywood studios. Optical MoCap has the advantage of being very precise, able to track things at high speeds, and not limited in any way to the human form like the slew of AI-assisted methods hitting the market right now. The disadvantage is that you need to place markers on any part of your subject you want tracked, film them from all angles, and process a whole lot of pixels. In [Dennis]’s case, it ended up being about four billion. Keep in mind, too, that actually locating those points in 3D space depends on knowing exactly where your cameras are: if you want sub-millimeter precision, your cameras need to be fixed with sub-millimeter tolerance. It’s a big project, hence a long video, which is embedded below.
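To see why camera placement matters so much, here is a minimal sketch of two-camera triangulation in plain Python. It is not [Dennis]’s solver: the camera positions, the marker position, and the 1 mm offset are all made-up numbers for illustration, and the midpoint-of-closest-approach method is just one simple way to intersect two rays.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def scale(a, s):
    return tuple(x * s for x in a)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def triangulate_midpoint(o1, d1, o2, d2):
    """Return the midpoint of the closest approach of two rays o + t*d."""
    w = sub(o1, o2)
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; cannot triangulate")
    t1 = (b * e - c * d) / denom
    t2 = (a * e - b * d) / denom
    p1 = add(o1, scale(d1, t1))
    p2 = add(o2, scale(d2, t2))
    return scale(add(p1, p2), 0.5)

# Hypothetical setup, in meters: two cameras 2 m apart, marker 2 m away.
marker = (0.0, 0.0, 2.0)
cam1 = (-1.0, 0.0, 0.0)
cam2 = (1.0, 0.0, 0.0)

# Viewing rays as each camera actually observes the marker.
ray1 = sub(marker, cam1)
ray2 = sub(marker, cam2)

# With perfectly known camera positions the marker is recovered.
true_estimate = triangulate_midpoint(cam1, ray1, cam2, ray2)

# Now assume the software believes camera 2 sits 1 mm off to the side.
cam2_believed = (1.001, 0.0, 0.0)
shifted_estimate = triangulate_midpoint(cam1, ray1, cam2_believed, ray2)

error_mm = math.dist(marker, shifted_estimate) * 1000.0
print(f"marker error from a 1 mm camera offset: {error_mm:.2f} mm")
```

In this toy geometry, a 1 mm mistake in where one camera is believed to be moves the reconstructed marker by roughly a millimeter, so the calibration error passes more or less straight through to the result.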
The DIY cameras use an AR0234 MIPI camera on a custom PCB with M12 lenses and IR filters. To improve the signal-to-noise ratio on optical MoCap, it’s standard to use near-IR light. The camera boards, as you might expect given the MIPI interface, hook into Raspberry Pi compute modules; the cheapest CM4 should work, though he’s using CM5s. The compute modules sit on custom boards that provide PoE and some other niceties, like a small microcontroller driven by the pulse-per-second pin to help trigger the cameras in sync.
Each camera gets a ring light of near-IR LEDs that pulse at 160 W, which would be way more than PoE is specced to provide, but since the LEDs are only on when the camera is taking a frame, the average power is well within allowable limits. With 16 cameras each having their own ring light, that’s a lot of near-IR photons. Don’t forget your safety squints!
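The pulsed-power arithmetic is easy to check. In the sketch below, only the 160 W peak comes from the project; the 200 µs pulse length and 120 fps frame rate are assumptions picked purely for illustration, and the 12.95 W figure is the power IEEE 802.3af guarantees at the powered device, which may not be the PoE class these boards actually use.

```python
def average_led_power(peak_watts, pulse_seconds, frames_per_second):
    """Average power of a light that is on for one pulse per frame."""
    duty_cycle = pulse_seconds * frames_per_second
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("pulse is longer than the frame period")
    return peak_watts * duty_cycle

PEAK_W = 160.0          # peak ring-light power, from the article
PULSE_S = 200e-6        # ASSUMED pulse length per frame (not from the source)
FPS = 120.0             # ASSUMED frame rate (not from the source)
POE_AF_BUDGET_W = 12.95 # IEEE 802.3af power available at the powered device

average_w = average_led_power(PEAK_W, PULSE_S, FPS)
print(f"duty cycle: {PULSE_S * FPS:.1%}, average LED power: {average_w:.2f} W")
print("fits in 802.3af budget:", average_w < POE_AF_BUDGET_W)
```

Under those assumed numbers the LEDs are lit for only a few percent of the time, so the average draw is a few watts, with local energy storage on the board presumably covering the peaks.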
Rather than process the images with OpenCV, [Dennis] wrote his own custom solution optimized for this use case, which he reports is 300x faster. Luckily, he’s put his implementation on GitHub, along with the rest of the project. Even if you don’t have any v-tubing ambitions, this project is very impressive and worth checking out in its entirety.
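For a feel of what the per-frame job looks like, here is a deliberately naive sketch of marker detection: threshold the image, group bright pixels into blobs, and take each blob’s intensity-weighted centroid. This is a textbook approach written for clarity, not [Dennis]’s optimized implementation, and the function name, threshold, and tiny test frame are all hypothetical.

```python
def find_marker_centroids(image, threshold):
    """Return (x, y) intensity-weighted centroids of bright blobs.

    `image` is a list of rows of pixel intensities; pixels at or above
    `threshold` (which must be positive) count as part of a marker.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    height = len(image)
    width = len(image[0]) if height else 0
    visited = [[False] * width for _ in range(height)]
    centroids = []
    for y in range(height):
        for x in range(width):
            if visited[y][x] or image[y][x] < threshold:
                continue
            # Flood-fill one 4-connected blob, accumulating weighted sums.
            stack = [(x, y)]
            visited[y][x] = True
            total = sum_x = sum_y = 0.0
            while stack:
                cx, cy = stack.pop()
                weight = image[cy][cx]
                total += weight
                sum_x += weight * cx
                sum_y += weight * cy
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and not visited[ny][nx]
                            and image[ny][nx] >= threshold):
                        visited[ny][nx] = True
                        stack.append((nx, ny))
            centroids.append((sum_x / total, sum_y / total))
    return centroids

# Hypothetical 6x6 frame with one 2x2 marker and one single-pixel marker.
frame = [
    [0,   0,   0, 0,   0, 0],
    [0, 200, 200, 0,   0, 0],
    [0, 200, 200, 0,   0, 0],
    [0,   0,   0, 0,   0, 0],
    [0,   0,   0, 0, 255, 0],
    [0,   0,   0, 0,   0, 0],
]
centroids = find_marker_centroids(frame, threshold=128)
print(centroids)
```

Weighting by intensity is what lets a centroid land between pixels, which is where the sub-pixel precision of marker tracking comes from. Doing this in pure Python on full-resolution frames from 16 cameras would be hopelessly slow, which is exactly why a purpose-built pipeline is worth the effort.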
Optical MoCap isn’t the only game in town, of course. If you want to do this cheap and easy, you can strap a bunch of IMU sensors to yourself– just don’t expect the same precision.
Thanks to [Dennis] for the tip!








