
By Business Science Daily — peer-reviewed sources, human-verified.
The use of AI in the workplace comes with clear benefits — but also trade-offs. The key question to ask is not simply whether employees use AI, but how they use it and which parts of their work are actually affected.
The research gap is not a lack of studies: existing work has examined AI in controlled laboratory settings or focused on narrow, repetitive tasks. What is missing is attention to the varied nature of knowledge work. A consultant, lawyer, or analyst doesn't perform the same task repeatedly; they juggle creative ideation, data analysis, persuasive writing, strategic reasoning, and client communication, often within a single project.
The questions to ask, then, are how AI performs across this range of activities and, more importantly, how humans collaborate with AI when the tasks themselves shift constantly.
The following working paper by Dell’Acqua, McFowland, Mollick, Lifshitz-Assaf, Kellogg, Rajendran, Krayer, Candelon, and Lakhani looks closely at the specific tasks employees rely on AI for and how this changes the way they perform their jobs. It explores the practical reality of AI adoption: what workers use it for, how it shapes their workflow, and how it influences outcomes.
Navigating the Jagged Technological Frontier: Effects of AI on Knowledge Worker Productivity and Quality
Field experimental evidence from 758 BCG consultants shows that inside its capability frontier, AI boosts productivity by 12.2% and quality by more than 40%, but outside the frontier it decreases accuracy by 19 percentage points.
The Jagged Technological Frontier
AI capabilities are uneven—tasks that appear similar in difficulty can be on different sides of AI’s capability boundary. Inside the frontier, AI dramatically boosts performance. Outside, it causes errors.
Inside frontier examples: Creative writing, brainstorming, drafting memos, idea generation
Outside frontier examples: Tasks requiring hidden context integration, nuanced data+interview synthesis, problems with traps
Core Findings:
- Inside the frontier: Consultants using AI completed 12.2% more tasks, worked 25.1% faster, and produced 40%+ higher quality outputs compared to control group.
- Outside the frontier: Consultants using AI were 19 percentage points less likely to produce correct solutions compared to those without AI.
- Skill distribution: Below-average performers improved by 43% with AI, above-average by 17%.
- Collaboration patterns: “Centaur” (strategic division of labor) and “Cyborg” (tight integration) approaches emerged.
- Homogenization effect: AI-assisted ideas were higher quality but less diverse across participants.
Methodology:
- Sample: 758 BCG strategy consultants (∼7% of global individual contributor cohort)
- Design: Pre-registered randomized experiment with baseline task, then random assignment to control, GPT-4 access, or GPT-4 + prompt engineering overview
- Inside frontier task: Creative product innovation (footwear) with 18 subtasks
- Outside frontier task: Business problem-solving with quantitative data and interviews containing a hidden trap
Inside the Frontier: Quality & Productivity Booster
Performance by Skill Level:
- Below-average performers: +43% improvement in the experimental task compared to baseline
- Above-average performers: +17% improvement in the experimental task compared to baseline
AI benefits lower performers more, narrowing the skill gap.
Homogenization Effect:
AI-assisted ideas were higher quality but less variable across participants. Semantic similarity analysis showed reduced diversity of ideas among AI users.
- Without AI: high diversity of ideas, but lower average quality
- With AI: higher-quality ideas, but everyone produces similar ones
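The homogenization finding rests on measuring how similar participants' ideas are to one another. As a minimal illustration of the idea (not the paper's actual method, which used richer semantic-embedding techniques), mean pairwise cosine similarity over simple bag-of-words vectors already captures the intuition: the higher the average similarity across participants, the less diverse the idea pool. The function names and toy ideas below are invented for this sketch.

```python
import math
from collections import Counter
from itertools import combinations

def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words term-frequency vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in set(va) & set(vb))
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

def mean_pairwise_similarity(ideas: list[str]) -> float:
    """Average similarity across all pairs: higher = less diverse ideas."""
    pairs = list(combinations(ideas, 2))
    return sum(cosine_similarity(a, b) for a, b in pairs) / len(pairs)

# Toy illustration: AI-assisted ideas tend to cluster around similar phrasings.
human_ideas = [
    "modular shoe with swappable soles",
    "biodegradable sandal from algae foam",
    "app-linked insole tracking gait",
]
ai_ideas = [
    "sustainable sneaker with recycled materials",
    "sustainable running shoe using recycled foam",
    "recycled-material sneaker for sustainability",
]

print(mean_pairwise_similarity(human_ideas))  # lower: diverse ideas
print(mean_pairwise_similarity(ai_ideas))     # higher: homogenized ideas
```

In the study's terms, AI users score higher on this kind of similarity measure even while their individual ideas rate higher in quality.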
Key Results Table:
| Condition | Quality Score | vs Control | Completion Rate |
|---|---|---|---|
| Control (no AI) | 4.10 | — | 82.4% |
| GPT Only | 5.66 | +38% | 91.4% |
| GPT + Overview | 5.85 | +42.5% | 93.5% |
All effects significant at p<0.001. Quality measured on 1-10 scale by human graders.
Outside the Frontier: Quality Disruptor
Consultants analyzed a business case with financial data and interviews. The spreadsheet alone suggested one conclusion, but careful reading of interviews revealed the opposite answer. GPT-4 typically missed this nuance.
📊 AI’s approach: Look at spreadsheet numbers → pick Channel A
👤 Correct human approach: Read interviews → Channel B has hidden advantages
Correctness Results:
The Quality Paradox:
- −19pp: AI groups were significantly less accurate
- +25%: yet their memos were rated more persuasive
Critical finding: AI generates fluent, persuasive text even when factually wrong. Humans often fail to catch errors because the output “looks good.”
Detailed Results Table:
| Condition | Correctness | vs Control | Recommendation Quality |
|---|---|---|---|
| Control | 84.4% | — | 5.86 |
| GPT Only | 70.5% | -13.9pp | 6.91 (+1.05) |
| GPT+Overview | 60.0% | -24.5pp | 7.34 (+1.48) |
Time Savings (outside frontier):
| Condition | Time Spent | vs Control |
|---|---|---|
| Control | 37.7 minutes | — |
| GPT Only | 30.9 minutes | -18% |
| GPT+Overview | 26.4 minutes | -30% |
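The "vs Control" columns are ordinary percent changes relative to the control group. A quick check of the time-savings figures (the `pct_change` helper is ours, not the paper's):

```python
def pct_change(treated: float, control: float) -> float:
    """Percent change of a treatment group relative to control, 1 decimal."""
    return round(100 * (treated - control) / control, 1)

# Time spent outside the frontier (minutes, from the table above):
print(pct_change(30.9, 37.7))  # -18.0  (GPT Only)
print(pct_change(26.4, 37.7))  # -30.0  (GPT+Overview)
```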
Collaboration Strategies: Centaurs & Cyborgs
Analysis of user logs revealed two distinct patterns among successful consultants:
Centaur Strategy
Named after the mythical half-human/half-horse creature—users strategically divide tasks between human and AI based on relative strengths.
- Division of labor: Clear handoffs between human and AI
- Human leads: Data analysis, strategic thinking, core recommendations
- AI supports: Drafting, refining, polishing, formatting
Example workflow: “Human analyzes financial data → Human decides recommendation → AI drafts memo to CEO → Human reviews and edits”
Cyborg Strategy
Named after science fiction hybrid beings—users tightly integrate with AI through continuous back-and-forth iteration.
- Tight integration: Subtask-level collaboration
- Practices include: Assigning persona (“act as a consultant”), requesting editorial changes, teaching through examples, validating outputs, demanding logic explanations, pointing out contradictions
Example interaction: “Act as a consultant… → AI responds → Revise that, focus on X → AI revises → Explain your logic → AI provides reasoning → Point out contradiction → AI adjusts”
Comparison of Strategies
| Feature | Centaur | Cyborg |
|---|---|---|
| Relationship | Strategic division of labor | Tight integration |
| Handoffs | Clear, task-level | Continuous, subtask-level |
| AI role | Tool/assistant | Collaborator/partner |
| Human role | Director, decision-maker | Co-creator, validator |
Key Insight:
Both patterns emerged among successful users. The choice may depend on task type, user skill, and familiarity with AI. Some users switched between modes depending on the subtask.
“I did the thinking, AI did the writing. Perfect division.” — Centaur user
“We went back and forth until it got it right. It felt like a partner.” — Cyborg user
Retainment Analysis:
Average retainment (copying AI output directly) was 0.87 on a 0-1 scale. Higher retainment correlated with higher quality (coefficient 1.21, significant). Training increased retainment.
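This summary does not spell out how retainment was operationalized. As an illustrative proxy only (an assumption, not the authors' measure), one could score the fraction of a consultant's final text whose tokens already appear in the AI draft:

```python
def retainment(ai_output: str, final_text: str) -> float:
    """Token-overlap proxy for retainment on a 0-1 scale: fraction of the
    final text's tokens that already appear in the AI draft. Illustrative
    stand-in; the paper's exact measure is not described in this summary."""
    ai_tokens = set(ai_output.lower().split())
    final_tokens = final_text.lower().split()
    if not final_tokens:
        return 0.0
    return sum(tok in ai_tokens for tok in final_tokens) / len(final_tokens)

# Hypothetical example: the consultant changes one word of the AI draft.
draft = "Channel B offers hidden advantages according to the interviews"
final = "Channel B offers clear advantages according to the interviews"
print(round(retainment(draft, final), 2))  # 0.89: 8 of 9 tokens retained
```

A score near the study's 0.87 average would mean most of the AI's wording survives into the final deliverable.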
Implications & Future Research
Theoretical Contributions:
- Jagged technological frontier: AI capabilities are uneven—tasks of similar perceived difficulty may be on different sides of the frontier.
- Human-AI collaboration patterns: Identifies Centaur and Cyborg behaviors as distinct integration strategies.
- Performance heterogeneity: AI benefits bottom performers more, potentially democratizing expertise.
- Quality vs. correctness tradeoff: AI can improve persuasiveness while decreasing accuracy—a dangerous combination.
Organizational Risks:
- Training deficit: Firms may stop giving junior workers “inside frontier” tasks, stunting skill development.
- Homogenization: Everyone produces similar high-quality output → less innovation.
- Persuasive errors: AI makes wrong answers look convincing → harder to catch mistakes.
Practical Implications:
- Training matters: Prompt engineering overview improved performance inside frontier but increased over-reliance outside frontier—training must include awareness of limitations.
- Task selection is critical: Organizations need to map which tasks are inside vs. outside AI’s current frontier.
- Diverse AI ecosystem: Consider using multiple LLMs or human-only involvement to counteract homogenization.
Complete Results Summary Table:
| Experiment | Metric | Control | GPT Only | GPT+Overview |
|---|---|---|---|---|
| Inside Frontier | Quality (1-10) | 4.10 | 5.66*** | 5.85*** |
| | Completion % | 82.4% | 91.4%*** | 93.5%*** |
| | Time (minutes) | 50.0 | 27.6*** | 29.5*** |
| Outside Frontier | Correctness % | 84.4% | 70.5%*** | 60.0%*** |
| | Recommendation Quality (1-10) | 5.86 | 6.91*** | 7.34*** |
| | Time (minutes) | 37.7 | 30.9*** | 26.4*** |
*** p<0.001, ** p<0.01, * p<0.05. All comparisons vs control group.
Policy Implications:
- Responsible AI: Need for safeguards when AI is used for high-risk tasks.
- Education: Formal training needed to build frontier-navigation skills.
- Diverse AI ecosystem: Use multiple models to counteract homogenization.
References
Dell’Acqua, F., McFowland III, E., Mollick, E., Lifshitz-Assaf, H., Kellogg, K.C., Rajendran, S., Krayer, L., Candelon, F., & Lakhani, K.R. (2023). Navigating the jagged technological frontier: Field experimental evidence of the effects of AI on knowledge worker productivity and quality. Harvard Business School Working Paper 24-013.
Key References:
- Brynjolfsson, E., Li, D., & Raymond, L.R. (2023). Generative AI at work. NBER Working Paper w31161.
- Eloundou, T., Manning, S., Mishkin, P., & Rock, D. (2023). GPTs are GPTs: An early look at the labor market impact potential of large language models.
- Noy, S., & Zhang, W. (2023). Experimental evidence on the productivity effects of generative artificial intelligence.
- Lebovitz, S., Lifshitz-Assaf, H., & Levina, N. (2022). To engage or not to engage with AI for critical judgments. Organization Science, 33(1), 126–148.
Data Source: Randomized field experiment with 758 BCG consultants. Pre-registered design with baseline task, random assignment to control, GPT-4 access, or GPT-4 + prompt engineering overview.
Acknowledgement: Funding provided in part by Harvard Business School.