🔗 URL Encoder / Decoder

Last updated: June 16, 2026

There's a persistent belief among developers that URL encoding is just "turning spaces into %20." Slap encodeURIComponent on everything, ship it, done. But URLs break in production anyway, API responses come back garbled, and someone opens a ticket. The problem isn't that percent-encoding is hard. It's that most developers never properly learned which characters need encoding, where, and why the rules differ depending on where in the URL you're working.

The Myth: One Function Encodes Everything Correctly

Ask a developer to encode a URL and they'll reach for encodeURIComponent without hesitation. Run this on https://example.com/search?q=hello world and you'll get https%3A%2F%2Fexample.com%2Fsearch%3Fq%3Dhello%20world, a string that's completely useless as a navigable link. The slashes got encoded. The colon disappeared into %3A. Every delimiter that gives a URL its meaning has been eaten alive.

This happens because encodeURIComponent was never meant for full URLs. It's designed for encoding the value inside a query parameter, not the URL itself. The sibling function encodeURI handles full URIs by deliberately preserving structural characters: :, /, ?, &, #, and a handful of others that define a URL's anatomy. Using the wrong function is one of the most common URL-related bugs in frontend and backend code alike.
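To see the difference side by side, here is a small sketch that runs both built-in functions on the example URL from above. The variable names are illustrative only.

```javascript
const url = "https://example.com/search?q=hello world";

// encodeURIComponent escapes the delimiters too, so the result is no longer a usable link.
const asComponent = encodeURIComponent(url);

// encodeURI leaves :, /, ? and = alone and escapes only the space.
const asUri = encodeURI(url);

console.log(asComponent); // https%3A%2F%2Fexample.com%2Fsearch%3Fq%3Dhello%20world
console.log(asUri);       // https://example.com/search?q=hello%20world
```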

What RFC 3986 Actually Says

Every URL lives under RFC 3986, the specification that defines what a Uniform Resource Identifier is. It draws a hard line between reserved characters and unreserved characters. Reserved characters (like /, ?, #, =, &) carry structural meaning and must only be percent-encoded when they appear as literal data, not as delimiters. Unreserved characters (letters, digits, -, _, ., ~) never need encoding and shouldn't be encoded.

Any character outside the unreserved set, including a reserved character used as literal data, is a candidate for percent-encoding: a percent sign followed by two hexadecimal digits representing the byte value. RFC 3986 treats uppercase and lowercase hex digits as equivalent but recommends uppercase for consistency. A space becomes %20 because the ASCII decimal value of a space is 32, and 32 in hexadecimal is 20. An at-sign (@) becomes %40. An ampersand used as literal data in a query parameter value becomes %26, so it doesn't get mistaken for the delimiter between parameters.
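The arithmetic is easy to reproduce by hand. The sketch below uses a hypothetical helper, percentEncodeAscii, that is not part of any library; it only handles single ASCII characters and exists purely to show the decimal-to-hex step.

```javascript
// Hypothetical helper for illustration: percent-encode one ASCII character by hand.
function percentEncodeAscii(ch) {
  const code = ch.charCodeAt(0); // " " -> 32, "@" -> 64, "&" -> 38
  if (code > 0x7f) {
    throw new RangeError("this sketch handles ASCII only");
  }
  // Convert to hex, uppercase it, and pad to two digits.
  return "%" + code.toString(16).toUpperCase().padStart(2, "0");
}

console.log(percentEncodeAscii(" ")); // %20
console.log(percentEncodeAscii("@")); // %40
console.log(percentEncodeAscii("&")); // %26
```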

Full URI Mode vs. Component Mode β€” The Core Distinction

This is the distinction most tutorials gloss over. When you have a complete URL like https://api.example.com/v1/search?q=café au lait&lang=fr, you want to encode only the parts that need it (the spaces and the accented character) while leaving the protocol, host, path separators, and query delimiters untouched. That's Full URI mode, which maps directly to JavaScript's encodeURI / decodeURI.

When you're building a URL programmatically and you have a raw parameter value (say, the user typed Tom & Jerry into a search box), you need every special character encoded before you concatenate it into the query string. If you left the & raw, the server would read it as a parameter delimiter and split your value in half. That's Component mode, using encodeURIComponent / decodeURIComponent, which encodes the delimiters too and leaves only letters, digits, and - _ . ! ~ * ' ( ) untouched.

The practical rule: use Component mode for values, Full URI mode for complete addresses. Never swap them.
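As a rough illustration of the rule, the sketch below builds the same search URL twice, once with the raw value and once with the value passed through encodeURIComponent. The host example.com and the lang parameter are placeholders, not anything prescribed by the article.

```javascript
const userInput = "Tom & Jerry";

// Raw concatenation: the & inside the value is read as a parameter delimiter.
const broken = "https://example.com/search?q=" + userInput + "&lang=en";

// Component mode applied to the value only, never to the whole URL.
const safe = "https://example.com/search?q=" + encodeURIComponent(userInput) + "&lang=en";

console.log(safe); // https://example.com/search?q=Tom%20%26%20Jerry&lang=en

// Parsing both shows what a server would see for q.
console.log(new URL(safe).searchParams.get("q"));   // Tom & Jerry
console.log(new URL(broken).searchParams.get("q")); // "Tom " (cut off at the raw &)
```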

The Plus Sign Problem Nobody Talks About

HTML forms encode spaces as + rather than %20, and they still do so by default today. The format is called application/x-www-form-urlencoded, and although it looks like standard percent-encoding, it follows its own rules. When a server returns name=John+Doe and you decode it with decodeURIComponent, you get John+Doe: the plus stays a plus, not a space. You'd need to replace + with %20 first, or use a parser that is aware of form encoding, such as URLSearchParams.

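This silently corrupts data in countless apps. Decode form data without converting + and every space shows up as a plus. Convert + to a space in a string where a literal plus was never encoded as %2B, and a user with a + in their name or a search query containing "C++" will find their input mangled. If you're consuming query strings from classic HTML form submissions, always check which encoding the form used before blindly decoding.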
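One way to handle form-encoded input is sketched below. decodeFormComponent is a hypothetical helper written for this example; URLSearchParams is the built-in parser that applies the form-encoding rules in both directions.

```javascript
// Hypothetical helper: decode a single form-encoded component.
function decodeFormComponent(s) {
  // In form encoding a literal plus arrives as %2B, so every raw + stands for a space.
  return decodeURIComponent(s.replace(/\+/g, "%20"));
}

console.log(decodeURIComponent("John+Doe"));  // John+Doe (plus is left alone)
console.log(decodeFormComponent("John+Doe")); // John Doe
console.log(decodeFormComponent("C%2B%2B"));  // C++

// The built-in parser does the same work without manual replacement.
const params = new URLSearchParams("name=John+Doe&lang=C%2B%2B");
console.log(params.get("name")); // John Doe
console.log(params.get("lang")); // C++

// Serializing goes the other way: spaces become +, literal pluses become %2B.
const serialized = new URLSearchParams({ q: "C++ tips" }).toString();
console.log(serialized); // q=C%2B%2B+tips
```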

Unicode Characters and Multi-Byte Encoding

Spaces are simple. Unicode characters are where things get genuinely interesting. The café example above: the letter é is U+00E9. In UTF-8, it encodes to two bytes: 0xC3 and 0xA9. As a percent-encoded sequence, it becomes %C3%A9. Both encodeURI and encodeURIComponent handle this correctly in every modern browser and Node.js runtime, because they always use the UTF-8 byte representation. The danger comes from older systems that used Latin-1 or other encodings, where é would encode as just %E9. Mix the two and your server will read mojibake instead of text.

Chinese, Arabic, Hindi, and emoji characters follow the same UTF-8 pattern but expand to more bytes. The emoji 😀 is U+1F600, encoded as four UTF-8 bytes: F0 9F 98 80, giving you %F0%9F%98%80 in a URL. Legitimate, valid, and increasingly common as URLs carry richer content.
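The byte-by-byte process can be made visible with TextEncoder, which always produces UTF-8. The helper percentEncodeUtf8 below is hypothetical and deliberately encodes every byte, so it is a teaching sketch rather than a replacement for the built-in functions. The characters are written as escapes to keep the source file ASCII-only.

```javascript
// Hypothetical helper: percent-encode every UTF-8 byte of a string, one byte at a time.
function percentEncodeUtf8(text) {
  const bytes = new TextEncoder().encode(text); // Uint8Array of UTF-8 bytes
  return Array.from(bytes, (b) => "%" + b.toString(16).toUpperCase().padStart(2, "0")).join("");
}

const eAcute = "\u00E9";    // the accented e from the cafe example
const grin = "\u{1F600}";   // the grinning-face emoji

console.log(percentEncodeUtf8(eAcute)); // %C3%A9
console.log(percentEncodeUtf8(grin));   // %F0%9F%98%80

// The built-in function arrives at the same bytes.
console.log(encodeURIComponent(eAcute) === percentEncodeUtf8(eAcute)); // true
console.log(encodeURIComponent(grin) === percentEncodeUtf8(grin));     // true
```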

Where Decoding Goes Wrong

Decoding has its own failure modes. The most frequent: a malformed percent sequence. If someone passes %ZZ, the two characters after the percent sign aren't hex digits, and both decodeURI and decodeURIComponent throw a URIError (V8 reports it as "URI malformed"). Code that calls these functions without try/catch will crash. This is exactly the kind of bug that doesn't show up in development (where inputs are controlled) but surfaces immediately when real users paste URLs from sources you didn't anticipate.
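A defensive wrapper might look like the sketch below. safeDecode is a hypothetical name, and returning null on failure is just one possible convention; a real application might prefer to return the original string or surface a validation error.

```javascript
// Hypothetical helper: return null instead of throwing on malformed input.
function safeDecode(s) {
  try {
    return decodeURIComponent(s);
  } catch (err) {
    if (err instanceof URIError) {
      return null; // malformed percent sequence
    }
    throw err; // anything else is unexpected, so let it propagate
  }
}

console.log(safeDecode("hello%20world")); // hello world
console.log(safeDecode("%ZZ"));           // null
console.log(safeDecode("100%"));          // null (lone % at the end)
```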

Double-encoding is another trap. A URL that's been encoded twice, with %2520 instead of %20 (because %25 is the encoding for %), will decode to %20 on the first pass, not to a space. Servers see a literal percent-20 in the path, routing fails, and developers spend an afternoon in confusion.
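The round trip is short enough to trace directly. The sketch below encodes a two-word string twice and then decodes it one pass at a time; the variable names are illustrative.

```javascript
const once = encodeURIComponent("a b");  // a%20b
const twice = encodeURIComponent(once);  // a%2520b (the % itself became %25)

console.log(once);
console.log(twice);

// One decode pass only undoes the outer layer.
const firstPass = decodeURIComponent(twice);
console.log(firstPass); // a%20b

// A second pass is needed to get back to the original text.
const secondPass = decodeURIComponent(firstPass);
console.log(secondPass); // a b
```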

The Fragment Identifier Edge Case

URL fragments (the #section part) are never sent to the server. The browser strips them before making the HTTP request. This means encoding a fragment identifier is pointless from a server-routing perspective, but it matters for client-side JavaScript that reads window.location.hash. If your single-page app uses hash-based routing and the fragment contains special characters, you need to handle encoding and decoding on the client yourself, because the server never sees it.
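For hash-based routing, the encode and decode steps both happen in client code. The sketch below uses the URL class instead of window.location so it also runs outside a browser; the route shape #/search/<term> and the host are assumptions made up for this example.

```javascript
const term = "Tom & Jerry";
const prefix = "#/search/"; // hypothetical route shape

// Encode the value before placing it in the fragment.
const pageUrl = new URL("https://example.com/app" + prefix + encodeURIComponent(term));

// This part goes into the HTTP request; the fragment is not included.
const requestTarget = pageUrl.pathname + pageUrl.search;
console.log(requestTarget); // /app

// This part stays on the client, still percent-encoded.
console.log(pageUrl.hash); // #/search/Tom%20%26%20Jerry

// Client-side routing code has to decode it itself.
const routeParam = decodeURIComponent(pageUrl.hash.slice(prefix.length));
console.log(routeParam); // Tom & Jerry
```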

A Quick Field Guide

Before reaching for an encoder, ask three questions. First: is this a complete URL or a single piece of one? Complete URL: use Full URI mode. Raw value going into a parameter: use Component mode. Second: where did this string come from? Form submission may mean plus-encoded spaces. Third: has this already been encoded? If you see % signs in the input, decode first and then re-encode to avoid the double-encoding trap.

Percent-encoding exists because URLs were designed for ASCII and the web became global. The rules are precise but learnable, and once you internalize the Full URI vs. Component distinction, most URL-related bugs stop being mysterious. They become predictable, catchable, and preventable, which is the best thing any encoding scheme can aspire to be.

FAQ

What is the difference between encodeURI and encodeURIComponent?
encodeURI encodes a complete URL, leaving structural characters like :, /, ?, &, and # intact so the URL remains navigable. encodeURIComponent encodes those structural characters as well, making it suitable only for encoding individual pieces, such as query parameter values, before they're inserted into a URL.
Why does my decoded URL still show a plus sign instead of a space?
HTML form submissions often use application/x-www-form-urlencoded format, which encodes spaces as + rather than %20. Standard decodeURIComponent does not convert + to space, so you need to replace + with %20 (or a literal space) before decoding, or use URLSearchParams, which handles this automatically.
When does decoding throw an error?
Both decodeURI and decodeURIComponent throw a URIError when they encounter an invalid percent-encoded sequence, such as %ZZ (no valid hex digits) or a lone % at the end of the string. Always wrap decode calls in a try/catch block when processing user-supplied or untrusted input.
How are non-ASCII characters like Chinese or emoji encoded in URLs?
They are first converted to their UTF-8 byte representation, then each byte is percent-encoded separately. For example, the emoji 😀 (U+1F600) encodes to four UTF-8 bytes, resulting in %F0%9F%98%80 in the URL. Modern browsers and JavaScript runtimes handle this automatically.
What is double-encoding and how do I avoid it?
Double-encoding happens when an already-encoded string gets encoded again. The percent sign in %20 becomes %25, turning %20 into %2520. The fix is to decode the string first to check if it's already encoded, or to avoid encoding values that have already been processed. The Swap button in this tool makes it easy to test a round-trip.
Should I encode URL fragments (the # part)?
URL fragments are stripped by the browser before the HTTP request is sent, so the server never receives them. Encoding a fragment matters only if client-side JavaScript reads window.location.hash and needs to handle special characters in hash-based routing. For server-side routing, encoding the fragment has no practical effect.