
iOS

PTer

This app provides a comprehensive report to support effective presentation practice. It analyzes whether your voice volume is appropriate, whether your pronunciation is clear, and whether your speaking pace is too fast or too slow. By repeating the same presentation multiple times, you can also generate comparison reports to track how much you've improved.

[Image: podcast mic]


My Role

Design & Planning

  • Wireframing

  • UX flow design

  • Research on sound-related technologies

  • Research on Apple technologies


Development

  • Speech-to-text (STT) using Apple’s Speech Framework

  • Korean STT implementation (see the setup sketch after this list)

  • Pronunciation clarity evaluation based on model confidence

  • Overall view implementation

  • Project folder structure setup

  • Backlog creation
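
As a minimal sketch of that setup, the code below requests speech-recognition authorization and creates a Korean recognizer. It assumes the standard SFSpeechRecognizer API with the ko-KR locale; the requestKoreanRecognizer helper is illustrative, and error handling is simplified.

```swift
import Foundation
import Speech

// A minimal sketch of the Korean STT setup, assuming the standard
// SFSpeechRecognizer API; error handling is simplified for brevity.
// Requires NSSpeechRecognitionUsageDescription in Info.plist.
func requestKoreanRecognizer(completion: @escaping (SFSpeechRecognizer?) -> Void) {
    SFSpeechRecognizer.requestAuthorization { status in
        guard status == .authorized else {
            completion(nil)
            return
        }
        // The ko-KR locale must be supported and available on this device.
        let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "ko-KR"))
        completion(recognizer?.isAvailable == true ? recognizer : nil)
    }
}
```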


Tech Stack

  • Figma

  • Xcode + SwiftUI

  • AVFoundation

  • Speech (Apple Speech Framework)

  • WhisperKit

  • Lottie

  • GitHub


Technical Overview

This project was built with the goal of leveraging Apple technologies to their fullest, with Sound as the core theme.

We primarily used AVFoundation for audio/video playback and editing, and incorporated Apple Speech and WhisperKit for speech-to-text functionality.
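
As a rough illustration of the AVFoundation side, the sketch below records a practice session with AVAudioRecorder and enables metering, which a volume analysis can read from. The file name and encoder settings are assumptions, not our actual configuration.

```swift
import Foundation
import AVFoundation

// A minimal sketch of recording a practice session with AVAudioRecorder.
// The file name and encoder settings are illustrative assumptions.
// Requires NSMicrophoneUsageDescription in Info.plist.
func makeRecorder() throws -> AVAudioRecorder {
    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.playAndRecord, mode: .default)
    try session.setActive(true)

    let url = FileManager.default.temporaryDirectory
        .appendingPathComponent("practice.m4a") // hypothetical file name
    let settings: [String: Any] = [
        AVFormatIDKey: Int(kAudioFormatMPEG4AAC),
        AVSampleRateKey: 44_100,
        AVNumberOfChannelsKey: 1,
        AVEncoderAudioQualityKey: AVAudioQuality.high.rawValue
    ]

    let recorder = try AVAudioRecorder(url: url, settings: settings)
    recorder.isMeteringEnabled = true // exposes averagePower(forChannel:) for volume analysis
    recorder.record()
    return recorder
}
```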


Challenges & Solutions

One major challenge was figuring out how to evaluate pronunciation clarity.

WhisperKit produced accurate results but was too slow, and the ETRI (Electronics and Telecommunications Research Institute) API accepted only one sentence per request, which made it impractical for full-length presentations.

In the end, we chose to use Apple’s Speech Framework, evaluating pronunciation clarity based on the confidence values returned by the model.
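
The sketch below shows roughly how that works: it transcribes a recorded file with SFSpeechRecognizer and averages the per-segment confidence values into a single clarity score. The averaging scheme and the scoreClarity helper are illustrative assumptions, not our exact scoring formula.

```swift
import Foundation
import Speech

// A minimal sketch of confidence-based clarity scoring. Averaging the
// per-segment confidences into one score is an assumption for
// illustration, not necessarily the app's exact formula.
func scoreClarity(of audioURL: URL, completion: @escaping (Double) -> Void) {
    guard let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "ko-KR")),
          recognizer.isAvailable else {
        completion(0)
        return
    }

    let request = SFSpeechURLRecognitionRequest(url: audioURL)
    request.shouldReportPartialResults = false // confidence is only populated on final results

    recognizer.recognitionTask(with: request) { result, _ in
        guard let result = result, result.isFinal else { return }
        // Each SFTranscriptionSegment carries a confidence in [0, 1].
        let confidences = result.bestTranscription.segments.map { Double($0.confidence) }
        let score = confidences.isEmpty ? 0 : confidences.reduce(0, +) / Double(confidences.count)
        completion(score)
    }
}
```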