Complete Guide to URL Encoding: When Spaces Become %20 (and When They Become +)

Every developer has seen %20 in a URL. Many have seen + in query strings. Fewer developers can explain why both represent a space, when to use which, and what other characters need encoding.

The short answer: %20 is the general-purpose percent-encoding for spaces. The + sign is a legacy convention from application/x-www-form-urlencoded form data. But there's more to URL encoding than spaces — and getting it wrong produces bugs that are notoriously hard to trace.

What URL Encoding Actually Does

URL encoding (officially called percent-encoding) converts characters that have special meaning in URLs, or that are not allowed in URLs at all, into a safe format. Each byte of the character is replaced by % followed by two hexadecimal digits. For ASCII characters, those digits are simply the character's ASCII code.

Space (decimal 32, hex 20) → %20
Colon (decimal 58, hex 3A) → %3A
Hash  (decimal 35, hex 23) → %23
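
The mapping above can be reproduced by hand. The following Python sketch is illustrative only: percent_encode is a hypothetical helper name, not a library function, and it assumes the input is encoded as UTF-8 and that everything outside the unreserved set should be escaped.

```python
import string

# Unreserved characters per RFC 3986: letters, digits, and - . _ ~
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

def percent_encode(text: str) -> str:
    """Percent-encode every byte that is not an unreserved character."""
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if ch in UNRESERVED:
            out.append(ch)
        else:
            # Two uppercase hex digits per byte, prefixed with %
            out.append(f"%{byte:02X}")
    return "".join(out)

print(percent_encode(" "))  # %20
print(percent_encode(":"))  # %3A
print(percent_encode("#"))  # %23
```

In real code, the standard library's urllib.parse.quote(text, safe="") should give the same result and is the better choice.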

A URL character can be:

  • Unreserved (A-Z, a-z, 0-9, -, _, ., ~) — always safe, never encoded
  • Reserved (:, /, ?, #, [, ], @, !, $, &, ', (, ), *, +, ,, ;, =) — safe only in their specific structural role
  • Other (space, Unicode characters, binary data) — must be encoded
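
As a rough illustration of the three classes, here is a small Python sketch. The classify function and the two set names are hypothetical and exist only for this example. The character sets are taken from the list above.

```python
import string

UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
RESERVED = set(":/?#[]@!$&'()*+,;=")

def classify(ch: str) -> str:
    """Return which of the three classes a single character falls into."""
    if ch in UNRESERVED:
        return "unreserved"
    if ch in RESERVED:
        return "reserved"
    return "other"

for ch in ["a", "~", "/", "+", " ", "é"]:
    print(repr(ch), classify(ch))
```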

The Complete Reserved Characters Table

Character  Used For                                    Encoded
:          Scheme/port separator                       %3A
/          Path separator                              %2F
?          Query string start                          %3F
#          Fragment identifier                         %23
[ ]        IPv6 address literal in host                %5B %5D
@          Auth info                                   %40
&          Query parameter separator                   %26
=          Key-value separator                         %3D
+          Sub-delimiter; space in form data (legacy)  %2B
$          Sub-delimiter                               %24
,          Sub-delimiter                               %2C
;          Sub-delimiter                               %3B
! ' ( ) *  Sub-delimiters                              %21 %27 %28 %29 %2A

%20 vs + : The Difference

This is the most common point of confusion. Both represent a space character, but they belong to different encoding contexts.

%20: Standard Percent-Encoding

%20 is the correct percent-encoding for space in all URL parts — path segments, query strings, fragment identifiers. It works everywhere:

https://example.com/search?q=hello%20world

+: application/x-www-form-urlencoded Legacy

The + sign represents a space only in query strings parsed under the application/x-www-form-urlencoded MIME type. This comes from the HTML form submission specification, which predates modern URL standards.

Submit an HTML form with method="GET" and a field containing a space:

<form action="/search" method="GET">
  <input name="q" value="hello world">
</form>

The browser generates:

/search?q=hello+world

The server-side URL parser then decodes + back to a space — but only in the query string. In path segments, + is literal:

https://example.com/a+b       → path segment "a+b"
https://example.com/?key=a+b  → query value "a b"
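
The same path-versus-query distinction can be checked with Python's standard library. This is a minimal sketch that assumes the server uses form decoding for the query string, which is what urllib.parse.parse_qs does.

```python
from urllib.parse import urlsplit, unquote, parse_qs

path_url = urlsplit("https://example.com/a+b")
query_url = urlsplit("https://example.com/?key=a+b")

# Path segments use plain percent-decoding, so + stays a plus sign.
path_value = unquote(path_url.path)

# parse_qs applies form decoding, so + becomes a space (and so does %20).
query_value = parse_qs(query_url.query)["key"][0]
percent_value = parse_qs("key=a%20b")["key"][0]

print(path_value)     # /a+b
print(query_value)    # a b
print(percent_value)  # a b
```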

Language-Specific Behavior

JavaScript:

// encodeURIComponent uses %20 for spaces
encodeURIComponent("hello world");
// "hello%20world"

// The URLSearchParams API decodes + as space
new URLSearchParams("q=hello+world").get("q");
// "hello world"

Python:

from urllib.parse import urlencode, quote

# quote uses %20 by default
quote("hello world")
# 'hello%20world'

# urlencode uses +
urlencode({"q": "hello world"})
# 'q=hello+world'

# quote leaves "/" unescaped by default; safe='' escapes it too
quote("a/b c", safe='')
# 'a%2Fb%20c'

# quote_plus is the variant that uses + for spaces
from urllib.parse import quote_plus
quote_plus("hello world")
# 'hello+world'

# urlencode switches to %20 when given quote_via=quote
urlencode({"q": "hello world"}, quote_via=quote)
# 'q=hello%20world'

Java:

import java.net.URLEncoder;
import java.net.URLDecoder;

// URLEncoder uses + for spaces (follows application/x-www-form-urlencoded)
String encoded = URLEncoder.encode("hello world", "UTF-8");
// "hello+world"

// URLDecoder decodes + as space
String decoded = URLDecoder.decode("hello+world", "UTF-8");
// "hello world"

C#:

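// HttpUtility.UrlEncode uses + for spaces (form encoding)
System.Web.HttpUtility.UrlEncode("hello world");
// "hello+world"

// WebUtility.UrlEncode also uses +
System.Net.WebUtility.UrlEncode("hello world");
// "hello+world"

// Uri.EscapeDataString uses %20 (RFC 3986 percent-encoding)
System.Uri.EscapeDataString("hello world");
// "hello%20world"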

PHP:

// urlencode uses + (legacy form encoding)
urlencode("hello world");
// "hello+world"

// rawurlencode uses %20 (RFC 3986)
rawurlencode("hello world");
// "hello%20world"

The PHP choice is instructive: urlencode() follows the form-encoding convention (+), while rawurlencode() follows the RFC 3986 percent-encoding standard (%20). When building query strings with PHP, use http_build_query() for form encoding and rawurlencode() for path segments.
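
The same split between form encoding for queries and %20 encoding for paths applies in other languages. Here is a hedged Python sketch: build_url is a hypothetical helper written for this example, and it assumes the base URL has no trailing slash.

```python
from urllib.parse import quote, urlencode

def build_url(base: str, segments: list[str], params: dict[str, str]) -> str:
    """Join path segments (%20 style) and form-encode the query (+ style)."""
    # safe="" also escapes "/", so a slash inside a segment cannot split the path.
    path = "/".join(quote(seg, safe="") for seg in segments)
    query = urlencode(params)
    return f"{base}/{path}?{query}" if query else f"{base}/{path}"

print(build_url("https://example.com", ["my files", "a/b.txt"], {"q": "hello world"}))
# https://example.com/my%20files/a%2Fb.txt?q=hello+world
```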

encodeURI vs encodeURIComponent

JavaScript provides two functions for URL encoding, and mixing them up is a common source of broken query strings.

const url = "https://example.com/search?q=hello world&category=books & more";

// encodeURI: encodes the full URL, preserves structural characters
encodeURI(url);
// "https://example.com/search?q=hello%20world&category=books%20&%20more"
// Problem: & is NOT encoded — it breaks query parameter parsing!

// encodeURIComponent: encodes everything, including structural characters
encodeURIComponent("hello world&category=books & more");
// "hello%20world%26category%3Dbooks%20%26%20more"

The rule is simple:

Function            Encodes                                     Preserves                                                  Use for
encodeURI           Spaces, non-ASCII, some special chars       : / ? # & = @ + $ , ; and the characters in the row below  Full URLs (rarely needed)
encodeURIComponent  Everything else, including : / ? # & = @ +  A-Z a-z 0-9 - _ . ~ ! * ' ( )                              Query parameter values, path segments

Wrong:

const base = "https://api.example.com/search";
const query = "q=" + encodeURI("books & more");
// Result: q=books%20&%20more — the & breaks parameter parsing!

Right:

const base = "https://api.example.com/search";
const query = "q=" + encodeURIComponent("books & more");
// Result: q=books%20%26%20more — single parameter value

// Even better — use URLSearchParams
const params = new URLSearchParams({ q: "books & more" });
const fullUrl = `${base}?${params}`;
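
The same failure can be reproduced outside JavaScript. In this Python sketch, quote with the reserved characters marked safe stands in for encodeURI, and quote with safe="" stands in for encodeURIComponent. Both are approximations rather than exact equivalents. parse_qs shows what a form-decoding server would see.

```python
from urllib.parse import quote, parse_qs

value = "books & more"

# Approximates encodeURI: reserved characters such as & are left alone.
loose = "q=" + quote(value, safe=":/?#[]@!$&'()*+,;=")
# Approximates encodeURIComponent: reserved characters are escaped too.
strict = "q=" + quote(value, safe="")

print(loose)             # q=books%20&%20more
print(parse_qs(loose))   # {'q': ['books ']}
print(strict)            # q=books%20%26%20more
print(parse_qs(strict))  # {'q': ['books & more']}
```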

What Characters Must Always Be Encoded

Some characters break URL parsing regardless of context and must be percent-encoded:

  • Space — breaks URL tokenization
  • Control characters (0x00–0x1F, 0x7F) — illegal in URLs
  • Non-ASCII characters — must be UTF-8 encoded then percent-encoded
  • % — the escape character itself; encoded as %25
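
A quick way to spot these characters is a regular-expression scan. The sketch below is an assumption-laden illustration: find_unencoded and PROBLEM are hypothetical names, and the check cannot tell whether a % followed by two hex digits was meant as an escape or as literal text.

```python
import re

# Matches a space or control character, any non-ASCII character,
# or a % that is not followed by two hex digits.
PROBLEM = re.compile(r"[\x00-\x20\x7f]|[^\x00-\x7f]|%(?![0-9A-Fa-f]{2})")

def find_unencoded(url: str) -> list[tuple[int, str]]:
    """Return (index, character) for each character that still needs encoding."""
    return [(m.start(), m.group()) for m in PROBLEM.finditer(url)]

print(find_unencoded("https://example.com/a b?q=100%"))
# [(21, ' '), (29, '%')]
print(find_unencoded("https://example.com/a%20b?q=100%25"))
# []
```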

Unicode and UTF-8 in URLs

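Modern URLs can contain Unicode characters, but not in raw form on the wire. In the path, query, and fragment, each character is UTF-8 encoded and every resulting byte is percent-encoded. Internationalized hostnames are converted to Punycode instead:

Original:  https://例子.测试
Punycode:  https://xn--fsqu00a.xn--0zwm56d
Percent:   https://%E4%BE%8B%E5%AD%90.%E6%B5%8B%E8%AF%95

The Punycode line is the form DNS actually sees for a hostname. The percent-encoded line shows the UTF-8 bytes of the same characters, which is the encoding used when such text appears in a path or query.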

JavaScript handles this automatically:

encodeURIComponent("测试");
// "%E6%B5%8B%E8%AF%95"

The sequence: character → UTF-8 bytes → each byte → hex → % prefix. Three bytes for most CJK characters.
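
The sequence can be traced step by step in Python. This is a sketch for illustration. The last line uses Python's built-in idna codec, which implements an older IDNA revision, to show the separate Punycode route that hostnames take.

```python
from urllib.parse import quote

text = "测试"

# Step 1: the characters become UTF-8 bytes (three per character here).
raw = text.encode("utf-8")
# Step 2: each byte becomes % plus two uppercase hex digits.
manual = "".join(f"%{b:02X}" for b in raw)

print(len(raw))              # 6
print(manual)                # %E6%B5%8B%E8%AF%95
print(quote(text, safe=""))  # %E6%B5%8B%E8%AF%95

# Hostnames are converted to Punycode rather than percent-encoded.
host = "例子.测试".encode("idna").decode("ascii")
print(host)                  # xn--fsqu00a.xn--0zwm56d
```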

Double Encoding: The Debugging Nightmare

Double encoding happens when you encode an already-encoded string:

const value = "hello world";
const once = encodeURIComponent(value);    // "hello%20world"
const twice = encodeURIComponent(once);    // "hello%2520world"

On the receiving end, a single decode yields hello%20world instead of hello world; only a second decode recovers the original. Because servers normally decode once, the %25 becomes a literal % and the application sees the text %20 rather than a space.
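
The effect is easy to reproduce in Python. The sketch below also includes a rough detection heuristic. looks_double_encoded is a hypothetical helper, and it will give false positives when the data legitimately contains text such as "%20".

```python
import re
from urllib.parse import quote, unquote

value = "hello world"
once = quote(value, safe="")   # hello%20world
twice = quote(once, safe="")   # hello%2520world

# A receiver that decodes once recovers the value only from `once`.
print(unquote(once))            # hello world
print(unquote(twice))           # hello%20world
print(unquote(unquote(twice)))  # hello world

def looks_double_encoded(s: str) -> bool:
    """Heuristic: %25 followed by two hex digits suggests a second encoding pass."""
    return re.search(r"%25[0-9A-Fa-f]{2}", s) is not None

print(looks_double_encoded(once))   # False
print(looks_double_encoded(twice))  # True
```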

Real-world example — OAuth redirect URLs:

// Wrong: double encoding
const redirect = "https://myapp.com/callback?code=abc123";
const wrongUrl = `https://auth.provider.com/auth?redirect_uri=${encodeURIComponent(encodeURIComponent(redirect))}`;

// Right: encode once
const rightUrl = `https://auth.provider.com/auth?redirect_uri=${encodeURIComponent(redirect)}`;
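
A round-trip check is a cheap way to confirm that a redirect URL was encoded exactly once. The Python sketch below reuses the example hosts from above and assumes the provider applies one standard form decode to the query string.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

redirect = "https://myapp.com/callback?code=abc123"

# urlencode escapes the value exactly once.
auth_url = "https://auth.provider.com/auth?" + urlencode({"redirect_uri": redirect})
print(auth_url)
# https://auth.provider.com/auth?redirect_uri=https%3A%2F%2Fmyapp.com%2Fcallback%3Fcode%3Dabc123

# Round trip: one decode on the provider side must return the original.
received = parse_qs(urlsplit(auth_url).query)["redirect_uri"][0]
assert received == redirect
```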

Encoding in Different Parts of a URL

Each URL component has different encoding rules:

  https://user:pass@host.com:8080/path/to/page?query=value#fragment
  \___/   \_______/ \______/ \__/\___________/ \_________/ \______/
  scheme  userinfo  host     port path         query       fragment

Component  Encoding Rules
Scheme     Letters, digits, +, -, and . only (as in schemes like git+https); never percent-encoded
Userinfo   Encode :, @, /, ?, #
Host       Punycode for internationalized domains, not percent-encoding
Port       Digits only; no encoding
Path       Encode ?, #, [, ], ", <, >, space. Allow / and :
Query      Encode &, =, #, and space inside keys and values; encode a literal + as %2B
Fragment   Encode #, space. Less critical (never sent to server)
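
To see how a parser divides the example URL into these components, Python's urllib.parse.urlsplit can be used. This is a minimal sketch, and other parsers may expose the parts under different names.

```python
from urllib.parse import urlsplit

parts = urlsplit("https://user:pass@host.com:8080/path/to/page?query=value#fragment")

print(parts.scheme)    # https
print(parts.username)  # user
print(parts.password)  # pass
print(parts.hostname)  # host.com
print(parts.port)      # 8080
print(parts.path)      # /path/to/page
print(parts.query)     # query=value
print(parts.fragment)  # fragment
```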

How to Use the URL Encoder Tool

If you're manually encoding parameters during development or debugging, the URL encoder & decoder handles all the edge cases: it correctly distinguishes between %20 and +, handles full URLs vs parameter values, and shows both the raw and encoded forms side by side.

Common debugging workflows:

  1. Paste a URL that's failing. If it contains unencoded spaces or Unicode characters, that's your bug.
  2. Paste a parameter value. Use + mode when you're building a form-encoded query string and %20 mode when you're building a path segment.
  3. Copy the encoded output and use it directly in your code or curl command.

FAQ

Why do some APIs accept both %20 and + for spaces?

Because they decode the query string with the application/x-www-form-urlencoded rules, which turn + into a space and also percent-decode %20 into a space. Most web frameworks handle both automatically.

Should I use %20 or + in API query parameters?

Use %20. It's the RFC 3986 standard. The + convention is legacy form encoding. Modern APIs should parse both, but %20 is unambiguous.

Does encodeURIComponent encode the + sign?

Yes. encodeURIComponent("hello+world") returns "hello%2Bworld". The + is treated as a literal character that needs encoding.
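
The reason this matters shows up as soon as a value contains a real plus sign. The Python sketch below assumes the receiving side uses form decoding, as parse_qs does.

```python
from urllib.parse import urlencode, parse_qs

# Unencoded: form decoding turns both plus signs into spaces.
lost = parse_qs("lang=C++")
print(lost)               # {'lang': ['C  ']}

# Encoded: each + is sent as %2B and survives the round trip.
encoded = urlencode({"lang": "C++"})
print(encoded)            # lang=C%2B%2B
print(parse_qs(encoded))  # {'lang': ['C++']}
```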

What happens if I don't encode a URL?

The browser or HTTP client may reject the request, truncate the URL at the first space or special character, or interpret unencoded characters as structural elements (& becomes a parameter separator, # becomes a fragment identifier).
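
The structural misreading is easy to demonstrate. In this Python sketch, an unencoded # in a query value cuts the query short, because the parser treats everything after it as the fragment.

```python
from urllib.parse import urlsplit, parse_qs, urlencode

# Unencoded "#" in the value: the rest of the URL becomes the fragment.
broken = urlsplit("https://example.com/search?q=C#&page=2")
print(broken.query)     # q=C
print(broken.fragment)  # &page=2

# Encoded: both parameters arrive intact.
fixed = urlsplit("https://example.com/search?" + urlencode({"q": "C#", "page": "2"}))
print(fixed.query)            # q=C%23&page=2
print(parse_qs(fixed.query))  # {'q': ['C#'], 'page': ['2']}
```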

How do I encode a URL in the terminal?

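curl -G --data-urlencode "query=hello world" https://api.example.com/search

The --data-urlencode flag encodes the value automatically, and -G (--get) appends it to the URL as a query string. Without -G, curl sends the encoded data as a POST body instead. For manual encoding in scripts: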

# Using Python
python3 -c "import urllib.parse; print(urllib.parse.quote('hello world'))"

Is URL encoding the same as HTML entity encoding?

No. URL encoding uses % followed by hex digits. HTML entity encoding uses & followed by a name or number (&amp;, &#160;). They serve different purposes and are not interchangeable.


When you're debugging a URL that won't parse or a query parameter that disappears on the server, the first thing to check is whether special characters have been properly encoded. The URL encoder & decoder gives you an immediate visual check — paste the problematic URL, and any unencoded characters are highlighted before you transform them.