Phi-3.5 vs TinyLlama vs Deepseek-R1 vs GPT-4o Mini: Creating a User Persona
To better understand how local AI models perform compared to cloud AI, I ran benchmarks for Microsoft’s Phi-3.5 model (MIT License), TinyLlama (Apache-2.0 License), Deepseek-R1 (MIT License), and GPT-4o Mini (free, proprietary). Benchmark results are based on how long each model takes to generate a persona.
Specs for the Machine Running the Tests
Testing on this hardware shows how the three local models perform against ChatGPT Free on a far-from-ideal setup, which makes for a realistic scenario.
- 16 GB Memory
- AMD Ryzen 3 3200g CPU, Released on July 7, 2019
- 4-core, 4-thread CPU
- Integrated Vega 8 Graphics
- 65 W power draw
- Picasso Architecture
- Socket AM4
- Diamond AMD Radeon HD 7970, Released on January 9, 2012
- 3GB GDDR5 Video Memory
- HDMI, DVI, and Mini DisplayPort
- Tahiti XT architecture
- 250 W power draw
- 550w Power Supply
- Ubuntu Linux
- 25.04 “Plucky Puffin”
- 6.14.0-15-generic
- OpenZFS File System
- Wine 9.0 (used to run Dot 0.9.3, a Windows frontend for Phi-3.5)
Generating Results
Prompt:
Generate a UX user persona of a man in his late-30s who has resistance to using AI due to privacy concerns. He wishes to live his private life in peace without sharing every detail of his personal life to social media. He never wants any information about his medical conditions, details about his wife and kids, his medications, and sexuality to be shared with any company or video sharing platform. Anything deeply personal is to only be shared with medical professionals he trusts. He fears that any information he shares with ChatGPT will be compromised and sold to a third party without his consent.
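The durations below are wall-clock times in mm:ss.s format. As a sketch of how such timings could be captured consistently (this is a hypothetical harness, not the script used for these numbers), any generation call can be wrapped like this:

```python
import time

def format_duration(seconds: float) -> str:
    """Format wall-clock seconds as mm:ss.s (e.g. 85.9 becomes "01:25.9")."""
    minutes, secs = divmod(seconds, 60)
    return f"{int(minutes):02d}:{secs:04.1f}"

def timed(generate, *args, **kwargs):
    """Run a generation callable and return (result, formatted duration)."""
    start = time.perf_counter()
    result = generate(*args, **kwargs)
    return result, format_duration(time.perf_counter() - start)
```

The same wrapper works for a local Ollama call or a cloud API call, so the numbers stay directly comparable.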
Phi-3.5 – Duration: 01:25.9 (on hardware, Dot 0.9.3, CPU only)
Issues Discovered with Dot UI
- Text field for typing message cannot be resized
- The dropdown to switch between Doc Dot and Big Dot closes prematurely, leaving no time to click an option
- Previous responses are not saved
- User persona was not completed (stopped mid-way)
Pros
- Model breaks the persona into headings, bulleted lists, and summary
- Includes gender, occupation, and location
Cons
- Persona is basic with only a few paragraphs detailing the user’s backstory
- Model stopped mid-way
Overall Quality
The persona lacked depth; it didn’t dive into the user’s motivations or the reasons for his skepticism.
Deepseek-R1 – Duration (excluding <think> intro): 01:52.4 (on hardware, Ollama, CPU & GPU)
Pros
- Persona was detailed with a complete backstory behind the user, with reasons for resisting AI and steps to address them
- Included goals, compromises, design implications, and features to consider
Cons
- Thinking process beforehand significantly increased generation time
- Could not run with too many other tasks open on low-spec hardware; I had to close other programs to get the model to work
Overall Quality
The generated persona included impressive details that put the designer in the user’s shoes, all within a similar timeframe to Phi-3.5.
TinyLlama – Duration: 00:21.0 (on hardware, Ollama, CPU & GPU)
Pros
- Model is only 662MB in size
- Ran fast on specs
Cons
- Persona is basic with only a few paragraphs detailing the user’s backstory
Overall Quality
The quality of the persona generated by TinyLlama was comparable to Phi-3.5, but with a smaller memory footprint and shorter loading time.
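The Ollama-based runs above can also be scripted rather than typed into the CLI. A minimal sketch against Ollama’s local REST API, assuming the server is running on its default port and that the model tags (e.g. `tinyllama`) match what was pulled:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint

def build_payload(model: str, prompt: str) -> bytes:
    """Build a non-streaming generate request for Ollama's REST API."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()

def generate(model: str, prompt: str) -> str:
    """Send a prompt to a locally running Ollama server and return the text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_payload(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example (requires `ollama pull tinyllama` and a running Ollama server):
# persona = generate("tinyllama", "Generate a UX user persona of a man ...")
```

Because everything stays on localhost, the prompt never leaves the machine, which is exactly the privacy property the persona below cares about.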
GPT-4o Mini – Duration: 00:34.4 (in cloud)
Pros
- Quick generation time
- Excellent details that tie everything back to the customer
- “Just because I use the internet doesn’t mean I want to live in public.” – brilliant quote
Cons
- Cloud-only; requires an OpenAI account to use
- Not private
Overall Quality
Both Deepseek-R1 and GPT-4o Mini delivered similar results. However, the one aspect missing from Mini’s persona was empathy. Deepseek’s persona offered an impressive look into the life of a user who doesn’t exist.
Key Takeaways
- Complex models take longer to generate their best results on mid-spec hardware. Users with these specs will be disappointed by the slower generation times unless the application clearly explains why they occur
- Smaller models intended for low-to-mid-spec hardware may generate responses faster, but they are less sophisticated than more advanced models
Winner: Deepseek-R1
UX User Persona: Alex
Age: 36
Gender: Male
Nationality: United States
Background
Alex is a 36-year-old IT professional who resides in the U.S. He has developed a strong reliance on AI for various aspects of his life, including personal productivity and decision-making. However, this reliance has been precipitated by his growing awareness of the ethical implications of AI.
Privacy Concerns
- Fear of Data Sale: Alex is deeply concerned about the potential sale of his personal data without consent from AI services
- Lack of Trust in Data Handling: He expresses skepticism towards large corporations and their practices regarding user data, especially when it comes to AI tools like ChatGPT
Goals
- To maintain control over his private life and avoid sharing sensitive information online
- To use AI tools without compromising his autonomy or privacy
Concerns
- Loss of Control: Alex fears that his personal information may be misused or sold without his knowledge
- Fear of Compromise: He is vigilant about how his data might fall into the wrong hands, especially when using AI services
Compromises
- Limited Information Sharing: Alex prefers to share minimal personal details with online platforms and AI tools
- Trust in Medical Professionals: He feels comfortable sharing health information only with trusted medical professionals
Design Implications
- Interface Design for Privacy: Use visual indicators such as a padlock or shield when messages are considered private, ensuring Alex knows these details can’t be shared externally
- Transparency in Data Handling: Clearly communicate that personal data will not be sold without consent at the point of input, e.g., “This message is only for me”
- Customizable Preferences: Allow users to set privacy boundaries for different platforms or AI tools, such as limiting shared health details on one platform but allowing others
- Focus on Privacy in AI Tools: Design AI interfaces with a prominent privacy policy highlight before data input, emphasizing security and user control
- Enhanced Do Not Track Functionality: Integrate a specific opt-out button for AI interactions to contact third parties without consent
- Reassurance Features: Provide reassurance through clear privacy statements or third-party seals when AI handles data securely, aligning with Alex’s need for trust
- Interaction Across Platforms and Services: Each platform should have tailored privacy policies reflecting Alex’s cautious approach to information sharing. Ensure that interactions online are governed by strict privacy controls, only allowing Alex to see limited details on specific platforms
- Trust in AI Systems: Build a reputation where AI tools prioritize user privacy and transparency through clear reassurance statements or seals when data is handled securely
This design ensures that Alex can maintain his autonomy and privacy while still using AI effectively, balancing his need for technology with his strong privacy safeguards.
Next Steps
- Create multiple user flow diagrams highlighting where the UI clearly explains what local AI is and how it avoids sharing any data with third parties
- Write copy and get input from independent voices
- Create lo-fi wireframes and prototype demonstrating how this experience may work
- Create multiple examples of error states that address lack of memory, crashes, and hiccups