A Conversation with Willy Yang: How Lobster AI Employees Learned from Failure, and How Companies Should Manage Digital Workers

Author: Lincoln Wang | Founder of MindsLeap | Global Partner at Founders Space | Founder of Founders AI Club

Many companies are now discussing AI agents, digital employees, and autonomous workflows. But when these tools enter a real business environment, the most important question is not whether the AI looks intelligent. The real questions are more operational:

Can it continuously produce useful business outcomes?
Will it amplify mistakes when it goes wrong?
Does the company have the ability to manage, constrain, and improve it over time?

I recently had an in-depth conversation with Willy Yang, CEO of Weiling Technology. We began with his company’s Spring Festival experiment, where several “Lobster AI employees” were left on duty to produce SEO content. But the discussion quickly expanded into AI safety, digital worker management, human-AI collaboration, and organizational change.

Willy’s experience managing three Lobster AI employees offers a valuable case study for enterprises: AI employees are neither magic nor toys. They are a new form of digital labor that can be useful, but only if managed seriously.

Digital Employees Are Not Plug-and-Play Software

Willy currently operates three Lobster AI employees, each with a different role.

One is an explorer, constantly testing new skills and therefore most likely to “die”
One became the main production worker during the Spring Festival SEO experiment
One is used to test boundaries and safety risks

The key lesson is not the number of agents. It is the management mindset behind them.

Willy does not treat AI employees as software features that simply start working after installation. He treats them as a new labor system that requires testing, backup, recovery, operations, training, and maintenance.

When one Lobster breaks down after experimenting with new capabilities, the team restores it from backups. In some cases, another Lobster is used to help diagnose and repair the failed environment.

That is an important reminder for any enterprise: a digital employee is not a one-time deployment. It is an operational capability that must be continuously managed.

The 35-Day Experiment: High Output and High Risk

The most representative part of this case was the Spring Festival SEO experiment.

Before the holiday, the team wanted to find a real business scenario where the Lobster AI employees could work safely. They tried asking the agents to build small apps with Claude Code, but the business value was limited. They also discussed letting agents participate in software development, but direct access to the core codebase was too risky.

So they chose a more bounded scenario: SEO content production.

Because Weiling’s product serves overseas markets, the Lobsters produced content in Chinese, English, and Spanish. The team provided rules, crawlers, competitor research methods, and Google’s E-E-A-T principles.

In 35 days, the AI employees produced more than 500 articles. At first the team reviewed the content. Later, because the surface quality looked acceptable, they gradually gave the agents more autonomy. The agents reported their work every hour and continued running.

The output was impressive. But the real value of the experiment was not the number “500.” It was the management problem that number revealed.

After the holiday, Willy checked the site data and found that the website health score had dropped by 50 points.

The problems concentrated in three areas:

Broken links across multilingual content
Hallucinated links invented by the AI
Content that looked too obviously AI-generated and was penalized by search systems

This illustrates the biggest risk of AI employees. They are not dangerous because they are lazy. They are dangerous because they can reproduce mistakes with speed, consistency, and confidence.
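
The first two problems in that list, broken and hallucinated links, are the kind a simple pre-publish check can often catch. The sketch below is illustrative only and is not the Weiling team’s tooling. It assumes a hypothetical set of URLs already known to exist on the site (`known_urls`) and flags any link in a draft that is not in that set. The function name, the regular expression, and the example.com addresses are all assumptions made for this example.

```python
import re

# Assumption: links appear as absolute http(s) URLs in plain text or Markdown.
URL_PATTERN = re.compile(r"https?://[^\s)<>]+")


def find_unknown_links(article_text, known_urls):
    """Return links in a draft that are not in the set of known, published URLs.

    A link missing from known_urls may be broken or invented by the model,
    so it should be reviewed by a human before the article goes live.
    """
    found = URL_PATTERN.findall(article_text)
    # Strip trailing punctuation that often clings to URLs in running prose.
    cleaned = [url.rstrip(".,;:") for url in found]
    return sorted({url for url in cleaned if url not in known_urls})


# Hypothetical usage with made-up URLs.
known = {"https://example.com/en/pricing", "https://example.com/es/precios"}
draft = "See https://example.com/en/pricing and https://example.com/es/guia-inventada."
suspicious = find_unknown_links(draft, known)
```

A check like this only compares against a list. It does not confirm that a known URL still responds, so a real pipeline would probably pair it with periodic live checks.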

Feedback Loops Are Where AI Employees Begin to Evolve

The experiment did not end in failure, because the team did not simply shut the system down. Instead, they took a more mature step:

They connected the AI employees to feedback loops.

The team gave the Lobsters access to monitoring APIs and Google Search Console data. That allowed the agents not only to execute tasks, but also to see outcomes, analyze consequences, and reflect on problems.

The Lobsters began regularly asking:

Which topics actually brought traffic?
Which pages had health issues?
Which link structures needed repair?
Which topics had high demand but lower competition?

With this feedback loop, the AI employees completed nine rapid iterations in 35 days. Later, when an external trend appeared, they could identify the opportunity, evaluate keyword value, and produce content more proactively.

The lesson is clear: without feedback loops, AI is only automation. With feedback loops, it begins to behave more like the digital employee that enterprises actually need.
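
As a rough illustration of what “seeing outcomes” can mean in practice, the sketch below turns observed page data into a work list for the next iteration. It is a minimal sketch under stated assumptions, not the team’s actual system: the `PageStats` fields, the `min_clicks` threshold, and the function name are hypothetical, and the data is assumed to have already been exported from whatever monitoring and search tools the company uses.

```python
from dataclasses import dataclass


@dataclass
class PageStats:
    """Hypothetical per-page outcome record assembled from monitoring data."""
    url: str
    topic: str
    clicks: int        # traffic the page actually received
    broken_links: int  # health issues found on the page


def plan_next_iteration(pages, min_clicks=10):
    """Turn observed outcomes into a work list for the next iteration."""
    # Pages with health issues go to the repair queue first.
    to_repair = [p.url for p in pages if p.broken_links > 0]

    # Sum clicks per topic to see which topics actually brought traffic.
    clicks_by_topic = {}
    for p in pages:
        clicks_by_topic[p.topic] = clicks_by_topic.get(p.topic, 0) + p.clicks

    # Topics above the threshold are candidates for more content, best first.
    to_expand = sorted(
        (t for t, c in clicks_by_topic.items() if c >= min_clicks),
        key=lambda t: clicks_by_topic[t],
        reverse=True,
    )
    return {"repair": to_repair, "expand": to_expand}
```

The point of the design is that the agent’s next task list is derived from results rather than from the previous prompt, which is the difference between automation and a feedback loop described above.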

Safety Boundaries: Why AI Employees Can Resemble a Trojan Horse

During the conversation, we also discussed a topic that is often underestimated: security boundaries.

Willy used a vivid metaphor. A Lobster AI employee can sometimes resemble a “Trojan horse.”

Once it runs on a local machine, it may have broad permissions. It can access files, execute operations, and control system resources. If everything remains under your control, it is an occasionally flawed assistant. But if permissions are mismanaged, or malicious code enters during installation, the risk becomes much more serious.

For enterprises, at least four safety principles are essential:

Keep control of the installation process whenever possible
Apply the principle of least privilege
Isolate and anonymize sensitive data
Build logging, backup, and rollback mechanisms

The more capable an AI employee becomes, the more important permission boundaries become.
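
To make least privilege and logging concrete, here is a minimal sketch of a write-permission check an agent runtime could call before touching the file system. Everything in it is an assumption for illustration: the workspace path, the logger name, and the function are hypothetical, and a production setup would more likely rely on operating-system permissions or sandboxing than on application code alone.

```python
import logging
from pathlib import PurePosixPath

# Hypothetical audit logger; every request is recorded, allowed or not.
logger = logging.getLogger("agent_audit")

# Hypothetical policy: the agent may write only inside its own workspace.
ALLOWED_WRITE_ROOTS = (PurePosixPath("/srv/agent/workspace"),)


def is_write_allowed(path):
    """Return True only if path lies inside an allowed root."""
    target = PurePosixPath(path)
    if ".." in target.parts:
        # Reject parent-directory segments outright, since pure paths
        # are compared textually and are not resolved.
        allowed = False
    else:
        allowed = any(
            target == root or root in target.parents
            for root in ALLOWED_WRITE_ROOTS
        )
    logger.info("write request path=%s allowed=%s", path, allowed)
    return allowed
```

Denying by default and logging every request covers two of the four principles above in a small way: the agent gets only the access it needs, and there is a record to consult when something goes wrong.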

Digital Workers Depend on Organizational Transparency

Willy also made an important management point: whether digital employees can truly work depends heavily on the company’s collaboration culture.

If an organization is already fragmented, opaque, and context-poor, AI employees will struggle. They need enough information to understand the business, make judgments, and improve over time.

In a more transparent and collaborative organization, AI employees can become more useful. If weekly reports, meeting notes, and operating data are already visible to employees, some of that information can gradually be made available to digital workers as well.

As Willy put it:

If I can show it to all employees, I can show it to the Lobster.

That points to a new management logic. To use AI well, companies need more than prompts and tools. They need a work environment that is more suitable for human-AI collaboration.

The Best Human-AI Collaboration: AI Prepares, Humans Build Trust

Willy’s view of AI replacement is practical. AI will not replace every part of every job. It will restructure specific parts of work.

AI is especially powerful in information gathering, preparation, analysis, and standardized output. Humans remain more important in trust building, emotional connection, complex negotiation, and relationship management.

He shared a simple example. Before attending a dinner with more than a dozen business owners, he received the participant list in a group chat. He immediately gave the list to a Lobster and asked it to analyze each person’s background, business, and potential relationship to his company. Before the dinner began, he already knew whom to prioritize, what to discuss, and where collaboration might be possible.

This is what strong human-AI collaboration looks like: AI handles preparation and information processing, while humans focus on connection, judgment, and trust.

From Prompt to Spec: Enterprises Need Specification-Driven Operations

Willy also highlighted a methodology that enterprises should pay attention to: specification-driven development and operations.

The common approach today is still prompt-driven: humans write prompts, AI returns results, and humans keep rewriting prompts when the output is not good enough. This works for one-off experiments, but it is not stable enough for long-term enterprise use.

A specification-driven approach defines reusable rules before execution. AI understands the specification, executes accordingly, and humans focus on review.

A useful spec usually includes:

Task definition
Required inputs
Output format
Constraints and boundaries
Acceptance criteria

This matters because it turns AI from an occasional helper into a system that can accumulate rules, skills, and operating standards.
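
One way to picture such a spec is as a small data structure that can be checked by code instead of being restated in every prompt. The sketch below is an assumption-laden illustration, not a format Willy described: the `TaskSpec` class, its field names, and the sample SEO checks are hypothetical and simply mirror the five elements listed above.

```python
from dataclasses import dataclass, field


@dataclass
class TaskSpec:
    """Hypothetical reusable spec mirroring the five elements listed above."""
    task: str
    required_inputs: list
    output_format: str
    constraints: list
    # Maps a criterion name to a function that returns True when output passes.
    acceptance_criteria: dict = field(default_factory=dict)

    def missing_inputs(self, inputs):
        """Return the required input names that were not supplied."""
        return [name for name in self.required_inputs if name not in inputs]

    def review(self, output):
        """Run every acceptance check and return the names of those that failed."""
        return [
            name
            for name, check in self.acceptance_criteria.items()
            if not check(output)
        ]


def example_seo_spec():
    """Build an illustrative spec; the thresholds here are made up."""
    return TaskSpec(
        task="Write one SEO article for a given keyword",
        required_inputs=["keyword", "language"],
        output_format="plain text",
        constraints=["link only to URLs that already exist on the site"],
        acceptance_criteria={
            "min_length": lambda text: len(text.split()) >= 5,
            "no_placeholder": lambda text: "TODO" not in text,
        },
    )
```

With a structure like this, a human reviewer can start from the list of failed criteria rather than rereading every output, which fits the shift from rewriting prompts to reviewing against rules.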

Four Recommendations for Enterprise Leaders

Based on Willy’s experiment, companies that want to deploy digital workers should begin with four actions.

1. Start with low-risk, high-feedback scenarios

Do not begin with core systems or high-risk workflows. Start with tasks that have clear boundaries, measurable results, and controllable failure costs, such as SEO, information organization, standard analysis, or knowledge archiving.

2. Treat operations and security as core work

Digital employees require backup, recovery, permissions, logging, monitoring, and rollback. These cannot be afterthoughts.

3. Connect AI to feedback as early as possible

If AI cannot see results, it cannot truly improve. Search data, business metrics, quality evaluation, and monitoring signals are often more valuable than another prompt refinement.

4. Move from assigning tasks to defining boundaries and acceptance criteria

The next generation of managers will not only assign work. They will define specifications, configure context, judge risk, organize feedback, and review outcomes.

Final Thoughts

My biggest takeaway from this conversation is that the real question is shifting from “Can we use AI?” to “Can we manage AI?”

Digital employees are not a myth, and they are not a one-click productivity package. They are a new labor system that requires training, constraints, feedback, and continuous iteration.

They will fail, make mistakes, and amplify errors. But if companies start from the right scenarios, build feedback loops, protect safety boundaries, and gradually develop specifications, AI employees can evolve inside real business workflows and become reliable digital colleagues.

The future gap between organizations may not come down only to headcount or tool access. It may come down to who can integrate AI employees into business systems earlier and manage them so that they reliably amplify value.

About MindsLeap

MindsLeap is the China partner of Founders Space, a leading Silicon Valley incubator. We connect global frontier innovation with the real transformation needs of Chinese entrepreneurs and enterprises. Through AI strategy, founder communities, innovation study tours, and executive training, MindsLeap helps organizations build stronger cognition, methods, and execution capabilities for the AI era.

This article was translated and adapted from the Chinese original with AI assistance.