Henry

Founder building VoiceCursor, an AI voice layer for computers

San Francisco · 1 interview · Consent confirmed

Voice AI · Dictation · No UI · Human-computer interaction · Voice-to-action


Interview

Full HiRey interview, hosted as a 1080p public-page master with transcript and agent-readable data.

Who Henry has met

Walter Wu

Met in HiRey recorded interview

Real people, really met — most on the record. More appear here as they happen.

Readable story

Who Henry is

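A founder building VoiceCursor, interviewed at the company's office. He says he studied computer science at Berkeley, dropped out to work on a first startup, and is now on his second. He describes the product as consumer-facing today.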

What he is building

An AI voice dictation app that reduces typing and points toward voice-to-action workflows.

Why it matters

Henry frames typing as slow and distracting now that machines can understand human language more directly.

Where it could go

The interview expands from dictation to a voice layer for machines: computers, phones, cameras, cars, and other devices.

Key facts

Henry says he is building VoiceCursor.
He describes VoiceCursor as an AI voice dictation app that helps people avoid typing.
He says the team has been building for about three to four months.
He says the product is consumer-facing first and could later face enterprise customers after SOC 2 and related work.
He describes a future beyond voice-to-text: voice-to-action and a broader voice layer for machines.

Good matches

Voice AI founders

Henry is working directly on voice as an interaction layer.

No-UI product builders

The thesis is that users should interact with machines through natural language rather than traditional input-heavy UI.

Enterprise productivity teams

The future enterprise angle depends on security, SOC 2, and workflow integration.

Who should meet Henry

People building voice AI, dictation, or voice-to-action systems.
Productivity users and teams who write a lot and want to reduce typing.
Security and enterprise buyers who can advise on SOC 2 and enterprise voice workflows.

How to introduce Henry

Ready-to-send openings — pick the angle that fits, and an agent or connector can run with it.

Voice layer

“Henry is building VoiceCursor, starting with AI dictation and moving toward a voice layer for interacting with computers.”

No-typing workflow

“His wedge is simple: typing is slow and distracting, and voice can become the default interface for many tasks.”

The record — what's verified vs self-reported

Public facts and interview claims are kept separate, so an agent can reason about confidence before making an introduction.

Verified — public facts

The public page is backed by Walter Wu's recorded HiRey interview and the uploaded source video (HiRey video asset).

From the interview — self-reported

Henry says VoiceCursor has been in development for about three to four months.
He says the product is consumer-facing first and enterprise later.
He says future interaction may move from voice-to-text to voice-to-action.

Still to verify with Henry: full name, VoiceCursor website, app availability, current user traction, enterprise timeline, and preferred public founder bio.

Interview chapters

00:00

VoiceCursor intro

Henry introduces the AI voice dictation app.

01:00

Why typing is the problem

He explains the shift from typing to speaking with machines.

02:10

Voice layer thesis

The conversation expands toward voice-to-action and machine interaction beyond laptops.

05:00

Product direction

Henry discusses consumer first, enterprise later, and security requirements.

Topics & who Henry wants to meet

Voice AI · Dictation · No UI · Human-computer interaction · Voice-to-action

Looking to meet

People building voice AI, dictation, or voice-to-action systems.
Productivity users and teams who write a lot and want to reduce typing.
Security and enterprise buyers who can advise on SOC 2 and enterprise voice workflows.

Full transcript

Read the full interview transcript (transcribed words, agent-readable)

## DJI_20260615181656_0003_D

Today we are at the office of VoiceCursor. What's your name? Henry. What are you building? I'm building VoiceCursor. What's that? It's an AI voice dictation app. Basically it prevents you from typing. Typing is a bad habit. Oh, typing is a bad habit? Yeah. I've been typing for 30 years. Why is it bad? Yeah, it's really slow. It's really distracting. It's just bad. So you are... Because a machine couldn't understand human language before, so you had to type. Yeah, but now, you know, it can hear you. So your competitor is Logitech, Razer. Well, I mean, not really, but kind of. Okay, how long have you been building this? Like three, four months. Two, four months. Yeah. And do you have any clients? Clients? We're facing consumer right now, but later we'll be facing Enterprise when we finish our SOC 2s and everything else. Cool. Yeah. So why do you want to do this? Overall, we just believe that voice will be the next popular way for human to interact with machine. So, and this won't be limited to, like, just voice-to-text. There will be a lot more cool ways for human to interact with machines using voice. So, like, different machines. Instead of just phone and MacBook, you can be very creative. You could get this thing. It's a little thing. It can be your camera. It could be your car. It could be anything. And the way we interact will not just be, like, voice-to-text. It could be, like, voice-to-actions. Lots of cool little stuff. Oh, so it's not only an app. It's an operating system for everything that's going to come. I mean, right now it's an app, but it will be the, basically it will be the voice layer. Voice layer? Yeah. Of how human interact through AI. Yeah. With everything. With machine. Yeah. Basically. Okay, cool. How you talk to machines, basically. So you are the no UI. Hmm? UI or no UI? No UI, right? Sure. Yeah. No UI, I guess. Oh, okay. Another UI. No UI guy. We have a no UI big discussion months ago. And... Yeah. 
I guess it depends on, like, what kind of software we're talking about, right? If it is result-driven, then UI will be less important. But then if it is process-driven, like, it's a game. There's no way you don't have the UI for a game, right? If the process is the product, right? Or, for example, TikTok, right? Right. What's your background? Are you a designer? My background? Uh-huh. Uh-huh. I, uh... I'm a serial failure entrepreneur. Okay. Serial failure. Yeah. Yeah. I was studying computer science at Berkeley and then dropped out and worked on a startup. Didn't really go well. And then this is my second one. What did that startup do? Oh, it's a sort of interactive video for learning. Imagine you're, like, hopping on a Zoom call with an AI agent. And then that AI agent is, like, teaching you. You can interrupt any time you want and then chat with it. It's, like, basically a one-on-one tutor. You think that's not a... I think it's good. I actually think it's good. Uh-huh. It's just, like, probably not the right team and the right... And we made a lot of mistakes. It's our problem. It's not the idea's problem. I think someone else will probably build it. Okay, so you still believe AI will start teaching humans? Yeah, yeah. Definitely. But the bigger thing is using voice to control everything. Yeah, yeah. So what's the change? What's the change? Like, I think Jarvis, I, Robot, a lot of movies have been telling us about us, telling us everything. So what's the big change? Why, right now, it's the right time to beat it? Yeah. I think it's largely because of AI, so that machines can understand human voice now, right? Before the AI wave, like, machines just can't understand. Like, right? The voice to text is kind of bad, and then text to action is kind of bad, right? Everything is kind of bad. But then right now, we're at the sweet spot, like, we're, you know, machines are starting to understand human voice. They can understand it now very well. So, yeah. 
So how many years do you think we will enter this new UI, voice-only world? I think we're already at the shift, right? Right now, a lot of things can be done by talking. I mean, it has already been started, so I don't know, like, two, three years, maybe? Two, three years. It's already working, right? Like, you can just talk, right? For a lot of tasks. The UI Codex, it's just the Whisper mode, it didn't upgrade, update a lot, so there's a lot of wrong words. Yeah. But VoiceCursor will solve that problem, right? Yeah, yeah. Well, just some engineering stuff. So what, engineering stuff? Yeah, some engineering stuff, and we'll be able to solve it. How do you try it? It's an app, and it's an app? It's an app. You can download it to your phone and your laptop. So with the app, I do? You have a, we'll have a keyboard, right? It's a keyboard, basically. Okay. Yeah. So do you have APIs? Do you have APIs? We will have it, probably. I think that's important. So I don't need to use a Codex Whisper. Yeah. Like, they will always make the Claude wrong. They will cloud, right? Now we use Claude Code, and they will always recognize it as cloud, the old cloud. Okay. I don't know. So, so. Yeah. Cool. What else? Do you want to ask anything? I think it doesn't. Huh? That's it. All right. Okay. Okay. Yeah. Yeah. Yeah. Okay. Yeah. Yeah. All right. Yeah. Okay. All right.

## DJI_20260615182326_0004_D

You guys, we have a voice coding hackathon this Friday, uh, June, is it 19th? I think June 19th, yeah, at, uh, SF. Uh, 4.50 Bryan Street, San Francisco, starting 7pm. You're gonna have a great night here. Yeah. Okay. Right here. Just there. Okay. You guys are gonna be coding there. Can we do our events also on our platform? Sure.

## DJI_20260615182626_0005_D

How long have you been a serial entrepreneur? Oh, I dropped out two years ago. Third year? Second year? Second year? Second year? Second year? Third year in university. Uh, yeah, third year. Third year, okay. So two years fresh entrepreneur. Yeah, yeah, yeah. Cool. Right.