Complete Guide to URL Encoding: When Spaces Become %20 (and When They Become +)
Every developer has seen %20 in a URL. Many have seen + in query strings. Fewer developers can explain why both represent a space, when to use which, and what other characters need encoding.
The short answer: %20 is the general-purpose percent-encoding for spaces. The + sign is a legacy convention from application/x-www-form-urlencoded form data. But there's more to URL encoding than spaces — and getting it wrong produces bugs that are notoriously hard to trace.
What URL Encoding Actually Does
URL encoding (officially called percent-encoding) converts characters that have special meaning in URLs, or that are not allowed in URLs at all, into a safe format. Each byte of the character is replaced by % followed by two hexadecimal digits. For ASCII characters, that is simply the character's hex code.
Space (decimal 32, hex 20) → %20
Colon (decimal 58, hex 3A) → %3A
Hash (decimal 35, hex 23) → %23
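To make the rule concrete, here is a minimal Python sketch. The helper name percent_encode_char is hypothetical (it is not a library function) and it only handles single ASCII characters; the standard-library quote() call is shown next to it for comparison.

```python
from urllib.parse import quote

def percent_encode_char(ch: str) -> str:
    """Return '%' plus the two-digit uppercase hex code of one ASCII character."""
    code = ord(ch)
    if code > 0x7F:
        # Non-ASCII text has to be turned into UTF-8 bytes first.
        raise ValueError("non-ASCII characters need UTF-8 encoding first")
    return f"%{code:02X}"

for ch in (" ", ":", "#"):
    # safe="" makes quote() encode reserved characters as well.
    print(repr(ch), ord(ch), percent_encode_char(ch), quote(ch, safe=""))
```

Both columns should agree for these three characters: %20, %3A, and %23.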
A URL character can be:
- Unreserved (A-Z, a-z, 0-9, -, _, ., ~): always safe, never encoded
- Reserved (:, /, ?, #, [, ], @, !, $, &, ', (, ), *, +, ,, ;, =): safe only in their specific structural role
- Other (space, Unicode characters, binary data): must be encoded
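The three groups can be sketched as a small lookup in Python. The names UNRESERVED, RESERVED, and classify are made up for this illustration, and the sets simply mirror the lists above.

```python
import string

# Character sets copied from the three groups listed above.
UNRESERVED = set(string.ascii_letters + string.digits + "-_.~")
RESERVED = set(":/?#[]@!$&'()*+,;=")

def classify(ch: str) -> str:
    """Place a single character into one of the three groups."""
    if ch in UNRESERVED:
        return "unreserved"
    if ch in RESERVED:
        return "reserved"
    return "other"

for ch in ("a", "~", "/", "+", " ", "%"):
    print(repr(ch), classify(ch))
```

Note that % lands in "other": it is the escape character, so a literal percent sign always has to be written as %25.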
The Complete Reserved Characters Table
| Character | Used For | Encoded |
|---|---|---|
| : | Scheme/port separator | %3A |
| / | Path separator | %2F |
| ? | Query string start | %3F |
| # | Fragment identifier | %23 |
| @ | Auth info | %40 |
| & | Query parameter separator | %26 |
| = | Key-value separator | %3D |
| + | Space (form data legacy) | %2B |
| $ | Sub-delimiter | %24 |
| , | Sub-delimiter | %2C |
| ; | Sub-delimiter | %3B |
%20 vs + : The Difference
This is the most common point of confusion. Both represent a space character, but they belong to different encoding contexts.
%20: Standard Percent-Encoding
%20 is the correct percent-encoding for space in all URL parts — path segments, query strings, fragment identifiers. It works everywhere:
https://example.com/search?q=hello%20world
+: application/x-www-form-urlencoded Legacy
The + sign represents a space only in query strings parsed under the application/x-www-form-urlencoded MIME type. This comes from the HTML form submission specification, which predates modern URL standards.
Submit an HTML form with method="GET" and a field containing a space:
<form action="/search" method="GET">
<input name="q" value="hello world">
</form>
The browser generates:
/search?q=hello+world
The server-side URL parser then decodes + back to a space — but only in the query string. In path segments, + is literal:
https://example.com/a+b → path segment "a+b"
https://example.com/?key=a+b → query value "a b"
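A short Python sketch of that split, assuming a hypothetical helper named decode_path_and_query: unquote() only understands %XX escapes, while parse_qs() applies the form-encoding rules and therefore turns + into a space.

```python
from urllib.parse import urlsplit, unquote, parse_qs

def decode_path_and_query(url: str):
    """Decode the path with plain percent-decoding and the query with form rules."""
    parts = urlsplit(url)
    path = unquote(parts.path)     # '+' stays a literal plus sign
    query = parse_qs(parts.query)  # '+' is decoded as a space
    return path, query

path, query = decode_path_and_query("https://example.com/a+b?key=a+b")
print(path)   # /a+b
print(query)  # {'key': ['a b']}
```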
Language-Specific Behavior
JavaScript:
// encodeURIComponent uses %20 for spaces
encodeURIComponent("hello world");
// "hello%20world"
// The URLSearchParams API decodes + as space
new URLSearchParams("q=hello+world").get("q");
// "hello world"
Python:
from urllib.parse import quote, quote_plus, urlencode
# quote uses %20 for spaces
quote("hello world")
# 'hello%20world'
# quote leaves "/" unencoded by default; pass safe='' to encode it too
quote("a/b c", safe='')
# 'a%2Fb%20c'
# quote_plus uses + for spaces (form encoding)
quote_plus("hello world")
# 'hello+world'
# urlencode builds a whole query string and uses + for spaces
urlencode({"q": "hello world"})
# 'q=hello+world'
Java:
import java.net.URLEncoder;
import java.net.URLDecoder;
// URLEncoder uses + for spaces (follows application/x-www-form-urlencoded)
String encoded = URLEncoder.encode("hello world", "UTF-8");
// "hello+world"
// URLDecoder decodes + as space
String decoded = URLDecoder.decode("hello+world", "UTF-8");
// "hello world"
C#:
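// HttpUtility.UrlEncode uses + for spaces (form encoding)
System.Web.HttpUtility.UrlEncode("hello world");
// "hello+world"
// WebUtility.UrlEncode also uses +
System.Net.WebUtility.UrlEncode("hello world");
// "hello+world"
// Uri.EscapeDataString uses %20 (RFC 3986)
System.Uri.EscapeDataString("hello world");
// "hello%20world"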
PHP:
// urlencode uses + (legacy form encoding)
urlencode("hello world");
// "hello+world"
// rawurlencode uses %20 (RFC 3986)
rawurlencode("hello world");
// "hello%20world"
The PHP choice is instructive: urlencode() follows the form-encoding convention (+), while rawurlencode() follows the RFC 3986 percent-encoding standard (%20). When building query strings with PHP, use http_build_query() for form encoding and rawurlencode() for path segments.
encodeURI vs encodeURIComponent
JavaScript provides two functions for URL encoding, and mixing them up is one of the most common sources of URL bugs.
const url = "https://example.com/search?q=hello world&category=books & more";
// encodeURI: encodes the full URL, preserves structural characters
encodeURI(url);
// "https://example.com/search?q=hello%20world&category=books%20&%20more"
// Problem: & is NOT encoded — it breaks query parameter parsing!
// encodeURIComponent: encodes everything, including structural characters
encodeURIComponent("hello world&category=books & more");
// "hello%20world%26category%3Dbooks%20%26%20more"
The rule is simple:
| Function | Encodes | Preserves | Use for |
|---|---|---|---|
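| encodeURI | Spaces, Unicode, characters that are never valid in a URL | :, /, ?, #, &, =, @, +, $, ;, the comma, and everything encodeURIComponent preserves | Full URLs (rarely needed) |
| encodeURIComponent | Everything else, including all structural characters | Only A-Z, a-z, 0-9 and - _ . ~ ! ' ( ) * | Query parameter values, path segments |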
Wrong:
const base = "https://api.example.com/search";
const query = "q=" + encodeURI("books & more");
// Result: q=books%20&%20more — the & breaks parameter parsing!
Right:
const base = "https://api.example.com/search";
const query = "q=" + encodeURIComponent("books & more");
// Result: q=books%20%26%20more — single parameter value
// Even better — use URLSearchParams
const params = new URLSearchParams({ q: "books & more" });
const fullUrl = `${base}?${params}`;
What Characters Must Always Be Encoded
Some characters break URL parsing regardless of context and must be percent-encoded:
- Space: breaks URL tokenization
- Control characters (0x00–0x1F, 0x7F): illegal in URLs
- Non-ASCII characters: must be UTF-8 encoded, then percent-encoded
- %: the escape character itself, encoded as %25
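As a rough illustration, the list above can be turned into a small checker. This is only a sketch: find_unencoded and STRAY_PERCENT are hypothetical names, and the check covers just the four cases listed, not full URL validation.

```python
import re

# A '%' that is not followed by two hex digits is not a valid escape.
STRAY_PERCENT = re.compile(r"%(?![0-9A-Fa-f]{2})")

def find_unencoded(url: str):
    """Return sorted (index, character) pairs for characters that should be encoded."""
    problems = []
    for i, ch in enumerate(url):
        code = ord(ch)
        # Space, control characters, DEL, and anything outside ASCII.
        if ch == " " or code < 0x20 or code == 0x7F or code > 0x7F:
            problems.append((i, ch))
    problems.extend((m.start(), "%") for m in STRAY_PERCENT.finditer(url))
    return sorted(problems)

print(find_unencoded("https://example.com/a b?q=100%&done=yes"))
print(find_unencoded("https://example.com/a%20b?q=100%25"))  # []
```

The first call should report the space in the path and the bare % in the query; the second URL is already clean.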
Unicode and UTF-8 in URLs
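Modern URLs can contain Unicode characters, but not as raw text on the wire. In a hostname, Unicode is converted to Punycode; in paths, query strings, and fragments, the text is UTF-8 encoded first and then each byte is percent-encoded. The same characters in both forms: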
Original: https://例子.测试
Punycode: https://xn--fsqu00a.xn--0zwm56d
Percent: https://%E4%BE%8B%E5%AD%90.%E6%B5%8B%E8%AF%95
JavaScript handles this automatically:
encodeURIComponent("测试");
// "%E6%B5%8B%E8%AF%95"
The sequence: character → UTF-8 bytes → each byte → hex → % prefix. Three bytes for most CJK characters.
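The same sequence, spelled out step by step in Python. The helper percent_encode_utf8 is a hypothetical name used only for this sketch; quote() from the standard library does the equivalent work in one call.

```python
from urllib.parse import quote

def percent_encode_utf8(text: str) -> str:
    """Encode every UTF-8 byte of text as %XX (nothing is left unencoded)."""
    return "".join(f"%{byte:02X}" for byte in text.encode("utf-8"))

word = "测试"
print(word.encode("utf-8").hex())  # e6b58be8af95 (three bytes per character)
print(percent_encode_utf8(word))   # %E6%B5%8B%E8%AF%95
print(quote(word))                 # %E6%B5%8B%E8%AF%95
```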
Double Encoding: The Debugging Nightmare
Double encoding happens when you encode an already-encoded string:
const value = "hello world";
const once = encodeURIComponent(value); // "hello%20world"
const twice = encodeURIComponent(once); // "hello%2520world"
On the receiving end, one decode gives hello%20world instead of hello world. Two decodes give the right result. If the server decodes once (standard behavior), your %25 becomes %, and the space never appears.
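The receiving side can be reproduced in a few lines of Python, using quote() and unquote() as stand-ins for whatever encoder and decoder your stack actually uses:

```python
from urllib.parse import quote, unquote

value = "hello world"
once = quote(value, safe="")   # hello%20world
twice = quote(once, safe="")   # hello%2520world

print(unquote(once))            # hello world    (correct)
print(unquote(twice))           # hello%20world  (one decode is not enough)
print(unquote(unquote(twice)))  # hello world    (only after a second decode)
```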
Real-world example — OAuth redirect URLs:
// Wrong: double encoding
const redirect = "https://myapp.com/callback?code=abc123";
const wrongUrl = `https://auth.provider.com/auth?redirect_uri=${encodeURIComponent(encodeURIComponent(redirect))}`;
// Right: encode once
const rightUrl = `https://auth.provider.com/auth?redirect_uri=${encodeURIComponent(redirect)}`;
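One way to catch this before it ships is a round-trip check: build the URL, decode it once the way a server would, and compare with the original. A Python sketch of that idea, reusing the example URLs from above (the parsing step is an assumption about how the provider reads its query string):

```python
from urllib.parse import urlencode, urlsplit, parse_qs

redirect = "https://myapp.com/callback?code=abc123"

# urlencode() encodes the nested URL exactly once.
auth_url = "https://auth.provider.com/auth?" + urlencode({"redirect_uri": redirect})
print(auth_url)

# Decode once and compare with the value we started from.
received = parse_qs(urlsplit(auth_url).query)["redirect_uri"][0]
print(received == redirect)  # True
```

If the comparison is False, the value was either encoded twice or not at all.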
Encoding in Different Parts of a URL
Each URL component has different encoding rules:
https://user:pass@host.com:8080/path/to/page?query=value#fragment
\___/   \_______/ \______/ \__/\___________/ \_________/ \______/
scheme  userinfo    host   port    path         query    fragment
| Component | Encoding Rules |
|---|---|
| Scheme | Letters, digits, +, -, and . only (for example git+https). Never percent-encoded |
| Userinfo | Encode :, @, /, ?, # |
| Host | Punycode for internationalized domains, not percent-encoding |
| Port | Digits only — no encoding |
| Path | Encode ?, #, [, ], ", <, >, space. Allow / and : |
| Query | Encode &, =, #, and space inside keys and values. Encode a literal + as %2B |
| Fragment | Encode #, space. Less critical (never sent to server) |
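Putting the table to work, here is a sketch of assembling a URL component by component in Python. The build_url helper is hypothetical, it assumes an https scheme and a plain ASCII host, and it chooses %20 over + in the query.

```python
from urllib.parse import quote, urlencode, urlunsplit

def build_url(host, segments, params, fragment=""):
    """Encode each component with the rule for its position, then join them."""
    # Path: encode each segment on its own so a "/" inside a segment becomes %2F.
    path = "/" + "/".join(quote(segment, safe="") for segment in segments)
    # Query: quote_via=quote yields %20 for spaces; the default would yield +.
    query = urlencode(params, quote_via=quote)
    return urlunsplit(("https", host, path, query, quote(fragment, safe="")))

print(build_url("example.com", ["docs", "a/b c"], {"q": "1+1=2 & more"}, "top"))
# https://example.com/docs/a%2Fb%20c?q=1%2B1%3D2%20%26%20more#top
```

Encoding the pieces before joining them is the key design choice: once the URL is a single string, you can no longer tell a structural / or & from one that belongs to the data.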
How to Use the URL Encoder Tool
If you're manually encoding parameters during development or debugging, the URL encoder & decoder handles all the edge cases: it correctly distinguishes between %20 and +, handles full URLs vs parameter values, and shows both the raw and encoded forms side by side.
Common debugging workflows:
- Paste a URL that's failing. If it contains unencoded spaces or Unicode characters, that's your bug.
- Paste a parameter value. Toggle between + mode and %20 mode depending on whether you're building a form-encoded query string or a path segment.
- Copy the encoded output and use it directly in your code or curl command.
FAQ
Why do some APIs accept both %20 and + for spaces?
Because they decode the query string using general URL parsing plus the application/x-www-form-urlencoded convention. Most web frameworks handle both automatically.
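Python's parse_qs() is one example of such a parser. The sketch below shows it accepting both spellings, and also the flip side: a literal + that was not encoded comes back as a space.

```python
from urllib.parse import parse_qs, quote

# Both spellings of a space decode to the same value.
print(parse_qs("q=hello%20world"))  # {'q': ['hello world']}
print(parse_qs("q=hello+world"))    # {'q': ['hello world']}

# A literal plus sign must be sent as %2B, or it is read as a space.
print(parse_qs("q=1+1"))                       # {'q': ['1 1']}
print(parse_qs("q=" + quote("1+1", safe="")))  # {'q': ['1+1']}
```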
Should I use %20 or + in API query parameters?
Use %20. It's the RFC 3986 standard. The + convention is legacy form encoding. Modern APIs should parse both, but %20 is unambiguous.
Does encodeURIComponent encode the + sign?
Yes. encodeURIComponent("hello+world") returns "hello%2Bworld". The + is treated as a literal character that needs encoding.
What happens if I don't encode a URL?
The browser or HTTP client may reject the request, truncate the URL at the first space or special character, or interpret unencoded characters as structural elements (& becomes a parameter separator, # becomes a fragment identifier).
How do I encode a URL in the terminal?
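curl -G --data-urlencode "query=hello world" https://api.example.com/search
The --data-urlencode flag handles encoding automatically, and -G appends the encoded data to the URL as a query string. Without -G, curl sends the data as a POST body instead. For manual encoding in scripts: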
# Using Python
python3 -c "import urllib.parse; print(urllib.parse.quote('hello world'))"
Is URL encoding the same as HTML entity encoding?
No. URL encoding uses % followed by hex digits. HTML entity encoding uses & followed by a name or number and a semicolon (&amp;amp;, &amp;nbsp;). They serve different purposes and are not interchangeable.
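The two encodings often stack rather than compete. A small Python sketch, assuming you are placing a query string inside an HTML href attribute: the value is URL-encoded first, then the finished URL is HTML-escaped for the markup.

```python
from html import escape
from urllib.parse import urlencode

# Step 1: URL-encode the parameter values.
query = urlencode({"q": "Tom & Jerry", "page": 2})
print(query)  # q=Tom+%26+Jerry&page=2

# Step 2: HTML-escape the whole URL before it goes into the attribute.
href = escape("/search?" + query)
print(href)   # /search?q=Tom+%26+Jerry&amp;page=2
```

The & inside the value became %26 (URL layer), while the & separating the two parameters became an HTML entity (HTML layer).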
When you're debugging a URL that won't parse or a query parameter that disappears on the server, the first thing to check is whether special characters have been properly encoded. The URL encoder & decoder gives you an immediate visual check — paste the problematic URL, and any unencoded characters are highlighted before you transform them.