acento.io
Text utility

Character Counter

Real-time character, codepoint, and byte counts — fully Unicode-aware, 100% client-side so your text never leaves your browser.

By Carlos Suárez, Systems engineer
Last updated:

What this character counter does

This English-language character counter goes beyond a simple tally. It separately tracks graphemes (what you visually see as one character), characters without spaces, raw Unicode codepoints, and the UTF-8 byte size of your text, all updated as you type. That distinction matters more than most people realize: the family emoji 👨‍👩‍👧 is one grapheme but five codepoints (three person emoji joined by two Zero Width Joiners), and JavaScript's built-in string.length returns neither. It counts UTF-16 code units instead, eight for this emoji, as noted in the MDN documentation for String.length. Whether you're trimming a tweet, auditing a meta description, or validating a database column, the right number depends on which unit the platform actually enforces. Everything runs 100% client-side in your browser, so your data never leaves your device. No uploads, no tracking, no server logs.
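
To make the four units concrete, here is a minimal sketch of how they can be computed in a browser or Node.js. It is an illustration, not this tool's actual source; the function name countUnits is made up for the example, and it assumes a runtime that provides Intl.Segmenter and TextEncoder.

```javascript
// Sketch only: countUnits is an illustrative name, not part of any library.
function countUnits(text) {
  const segmenter = new Intl.Segmenter("en", { granularity: "grapheme" });
  return {
    graphemes: [...segmenter.segment(text)].length, // user-perceived characters
    codepoints: [...text].length,                   // string iteration walks codepoints
    utf16Units: text.length,                        // what String.prototype.length reports
    utf8Bytes: new TextEncoder().encode(text).length,
  };
}

// Family emoji written with escapes: man + ZWJ + woman + ZWJ + girl.
const family = "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}";
console.log(countUnits(family));
// { graphemes: 1, codepoints: 5, utf16Units: 8, utf8Bytes: 18 }
```

Each person emoji takes 4 UTF-8 bytes and each Zero Width Joiner takes 3, which is where the 18 comes from.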

Features

  • Grapheme counting. Counts user-perceived characters using Unicode segmentation rules, so emoji sequences and combining marks are counted as one unit. Platforms differ in the details (Twitter/X, for example, applies its own weighted count), but the grapheme count is the closest match to what a reader sees.
  • No-spaces count. Shows characters excluding whitespace, useful when an editorial style guide or an academic submission limit excludes spaces from the count.
  • Raw codepoint count. Displays the number of Unicode codepoints in the string, which is the figure Python's len(text) returns and the unit that character-based column limits such as VARCHAR(n) in PostgreSQL or MySQL are measured in.
  • UTF-8 byte size. Reports the encoded byte length, critical when a column limit is measured in bytes rather than characters (for example Oracle's VARCHAR2(255 BYTE) versus VARCHAR2(255 CHAR)), a common source of truncation or rejected writes in production migrations.
  • Most common character. Identifies which character appears most frequently, handy for spotting runaway punctuation, invisible Unicode, or unexpected repeated tokens in generated text.
  • Real-time updates. Every metric refreshes on each keystroke with no debounce lag, so you can paste a block of text and see all counts instantly without clicking anything.
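
As an illustration of the no-spaces count and the most-common-character feature, the sketch below shows one way they could be computed. The helper names are hypothetical, and two choices are assumptions rather than documented behavior of this tool: whitespace is anything matched by the regular expression \s, and whitespace is included when tallying the most common character.

```javascript
// Sketch only: helper names are illustrative, not this tool's source.
const graphemeSegmenter = new Intl.Segmenter("en", { granularity: "grapheme" });

// Split text into user-perceived characters.
function toGraphemes(text) {
  return [...graphemeSegmenter.segment(text)].map((s) => s.segment);
}

// Count graphemes that are not whitespace.
// \s covers spaces, tabs, newlines, and other Unicode whitespace such as NBSP.
function countWithoutWhitespace(text) {
  return toGraphemes(text).filter((g) => !/^\s+$/u.test(g)).length;
}

// Most frequent grapheme; on a tie, the one that appeared first wins.
function mostCommonGrapheme(text) {
  const tally = new Map();
  for (const g of toGraphemes(text)) {
    tally.set(g, (tally.get(g) ?? 0) + 1);
  }
  let best = null;
  for (const [grapheme, count] of tally) {
    if (best === null || count > best.count) best = { grapheme, count };
  }
  return best; // null for empty input
}

console.log(countWithoutWhitespace("a b\tc\n")); // 3
console.log(mostCommonGrapheme("hello!!!"));     // { grapheme: '!', count: 3 }
```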

How to use the character counter

Paste or type text into the input area and all counts update immediately. Use the Clear button to reset.

  1. Paste or type your text. Click the text area and type, or paste from your clipboard. The counter starts as soon as there is input.
  2. Read the right metric for your platform. Use **Characters** for Bluesky (300-grapheme limit); Twitter/X's 280 limit uses its own weighted count, so treat the figure as a close estimate there. Use **UTF-8 bytes** for database VARCHAR byte limits. Use **Codepoints** when working with low-level string APIs. Note that in JavaScript [...text].length counts codepoints and text.length counts UTF-16 code units; graphemes require Intl.Segmenter.
  3. Check no-spaces count for SEO copy. Meta descriptions should stay around 155–160 characters including spaces; title tags around 60. The no-spaces figure helps when a style brief specifies a character budget that excludes whitespace.
  4. Clear and start over. Hit the Clear button to empty the field and reset all counters to zero.

Common use cases

  • Social media copy. Twitter/X enforces a 280-character limit using a weighted count: most Latin-script characters weigh one unit, while CJK characters and emoji weigh two. Checking your post here before publishing avoids the frustrating surprise trim when you hit Post.
  • SMS segment budgeting. A plain-text SMS fits 160 GSM-7 characters in a single segment, but a single emoji forces the whole message into UCS-2 encoding, dropping the limit to 70 characters. Once a message is split, each part also carries a concatenation header, so the per-segment capacity falls to 153 (GSM-7) or 67 (UCS-2). Each extra segment costs extra, so spotting a stray non-GSM-7 character helps you keep messages to one segment.
  • SEO title and meta description auditing. Search snippets truncate title tags around 60 characters and meta descriptions around 155–160. Writers working on content for cities like Toronto or Seattle often draft in English first and audit character counts before handing off for localization.
  • Database column validation. Before running a migration that writes user-generated content into a VARCHAR(255) column, paste a worst-case sample here and compare the UTF-8 byte count against the column definition. A string that is 255 graphemes can exceed 255 bytes once emoji or CJK characters are involved.
  • Word-count benchmarking. Studies of top-ranking Google results suggest the first position averages around 1,447 words. If you also need to track word count alongside character count, our [word counter](/en/word-counter/) handles that in the same real-time style.

Frequently asked questions

Does this tool send my text to a server?

No. The entire counter runs in your browser as client-side JavaScript. Nothing is transmitted, stored, or logged anywhere. You can disconnect from the internet and it will still work.

Why do graphemes and codepoints show different numbers?

Unicode allows multiple codepoints to combine into a single visible character. The family emoji 👨‍👩‍👧 is composed of three person emoji joined by two Zero Width Joiner (U+200D) codepoints: one grapheme, five codepoints. Platforms that enforce a character limit do not all count the same unit: Bluesky counts graphemes, Twitter/X applies a weighted count, and SMS gateways count encoding units rather than graphemes.

Why does JavaScript's string.length give a different number?

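JavaScript strings are sequences of UTF-16 code units. Characters outside the Basic Multilingual Plane, including most emoji, are encoded as surrogate pairs and count as 2 in string.length. The spread form [...text].length iterates by codepoint, so it handles surrogate pairs but still counts every codepoint of a combined sequence separately. To get graphemes in JS, use Intl.Segmenter with granularity set to "grapheme", which applies Unicode segmentation rules.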
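
A short check makes the difference between the three JavaScript approaches visible. This is a sketch that assumes a runtime with Intl.Segmenter; graphemeCount is a helper name chosen for the example.

```javascript
// Sketch only: graphemeCount is an illustrative helper.
const seg = new Intl.Segmenter("en", { granularity: "grapheme" });
const graphemeCount = (text) => [...seg.segment(text)].length;

const grin = "\u{1F600}"; // grinning face emoji, outside the Basic Multilingual Plane
const eAcute = "e\u0301"; // "e" followed by a combining acute accent

// Columns: UTF-16 code units, codepoints, graphemes
console.log(grin.length, [...grin].length, graphemeCount(grin));       // 2 1 1
console.log(eAcute.length, [...eAcute].length, graphemeCount(eAcute)); // 2 2 1
```

The second line is the telling one: spreading the string still reports 2 for a letter plus combining mark, while the segmenter reports 1.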

How does the byte count relate to database VARCHAR limits?

It depends on how the column was defined. MySQL VARCHAR(255) in a utf8mb4 table means 255 characters, but other systems measure the limit in bytes: Oracle's VARCHAR2 uses byte semantics by default, and SQL Server's varchar(n) is a byte length. A single emoji codepoint occupies 4 UTF-8 bytes (a joined sequence occupies more), so a 100-grapheme string with emoji may exceed a 100-byte column limit. Always check the column's character set and length semantics before a migration.
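
When a byte limit does apply, the check can be scripted before a migration. The sketch below is one possible approach under stated assumptions: the column stores UTF-8, the limit is in bytes, and cutting at grapheme boundaries is acceptable. The function names utf8Length and truncateToBytes are hypothetical.

```javascript
// Sketch only: function names are illustrative, not a library API.
const encoder = new TextEncoder();
const byGrapheme = new Intl.Segmenter("en", { granularity: "grapheme" });

// UTF-8 encoded size of a string, in bytes.
function utf8Length(text) {
  return encoder.encode(text).length;
}

// Longest prefix that fits in maxBytes without cutting a grapheme in half.
function truncateToBytes(text, maxBytes) {
  let used = 0;
  let out = "";
  for (const { segment } of byGrapheme.segment(text)) {
    const size = utf8Length(segment);
    if (used + size > maxBytes) break;
    used += size;
    out += segment;
  }
  return out;
}

const sample = "ab\u{1F600}cd"; // "ab", a 4-byte emoji, then "cd"
console.log(utf8Length(sample));         // 8
console.log(truncateToBytes(sample, 5)); // "ab" (the emoji would not fit)
```

Truncating by grapheme rather than by byte avoids writing half of a multi-byte character, which some databases reject and others store as a replacement character.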

Why does a message with one emoji cost more to send via SMS?

Standard SMS uses GSM-7 encoding, which fits 160 characters per segment. Once any character outside the GSM-7 alphabet is present — including any emoji — the entire message switches to UCS-2, reducing capacity to 70 characters per segment. Multi-segment messages are billed as multiple messages by most carriers.
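
The segment arithmetic can be estimated in code. The following is a rough sketch, not a carrier-accurate encoder: smsSegments is a hypothetical name, the alphabet tables are the GSM 03.38 default and extension sets written out by hand, and the 153 and 67 per-part capacities for split messages (not stated above) come from the concatenation header each part carries. Real gateways may differ, for example by using national shift tables or by avoiding splits in the middle of a character pair.

```javascript
// Sketch only: an estimate, not a carrier-accurate SMS encoder.
const GSM7_BASIC =
  "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?" +
  "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà";
const GSM7_EXTENDED = "\f^{}\\[~]|€"; // each costs two septets (escape + character)

function smsSegments(text) {
  let septets = 0;
  let fitsGsm7 = true;
  for (const ch of text) {
    if (GSM7_BASIC.includes(ch)) septets += 1;
    else if (GSM7_EXTENDED.includes(ch)) septets += 2;
    else { fitsGsm7 = false; break; }
  }
  if (fitsGsm7) {
    const segments = septets <= 160 ? 1 : Math.ceil(septets / 153);
    return { encoding: "GSM-7", units: septets, segments };
  }
  const units = text.length; // UCS-2 is counted in UTF-16 code units
  const segments = units <= 70 ? 1 : Math.ceil(units / 67);
  return { encoding: "UCS-2", units, segments };
}

console.log(smsSegments("a".repeat(160)));             // GSM-7, 160 units, 1 segment
console.log(smsSegments("a".repeat(160) + "\u{1F600}")); // UCS-2, 162 units, 3 segments
```

The second call shows the cost described above: appending one emoji to a full single-segment message turns it into three segments.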

Is there a way to count characters programmatically instead of pasting here?

Yes. In Python, len(text) counts codepoints. For graphemes you need a third-party package: either the grapheme library or the regex package with its \X pattern (the standard-library re module does not support \X). In Bash, wc -m file.txt counts characters (codepoints in a UTF-8 locale), and wc -c gives the byte size. The NIST SP 800-63B digital identity guidelines also specify that password length should be measured in Unicode codepoints, a useful reminder that the choice of unit has real security implications.