Regex Cheatsheet: From Syntax Classification to Search Filtering

Regular expressions are essential for developers, but the syntax is complex and hard to memorize. A well-designed regex cheatsheet helps you find the syntax you need instantly. This article walks through how the JsonKit Regex Cheatsheet is implemented, from syntax classification to real-time search filtering.

Why You Need a Regex Cheatsheet

Regex has dozens of metacharacters, quantifiers, and assertions, and it’s hard to remember them all. For instance, \b is a word boundary while \B is a non-word boundary; (?=abc) is a positive lookahead while (?!abc) is a negative lookahead. Details like these slip away when you’re not using them regularly.

A cheatsheet’s value lies in organized syntax classification plus copyable examples. The goal is not to memorize syntax but to find it quickly when you need it.

Syntax Classification Design

Regex syntax can be organized into 8 categories. Six of them are outlined here:

1. Character Classes

Patterns matching specific character sets:

  • . - Any single character (except line terminators, unless the s flag is set)
  • \d / \D - Digit / Non-digit
  • \w / \W - Word character / Non-word character
  • \s / \S - Whitespace / Non-whitespace
  • [abc] / [^abc] - Character set / Negated set
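
The character classes above can be tried out with a short sketch. It assumes JavaScript's regex engine and writes each pattern as a string (the way a cheatsheet entry would store it), so backslashes are doubled; the sample strings and variable names are illustrative only.

```typescript
// Patterns are written as strings, the way a cheatsheet would store them,
// so each backslash is doubled and the RegExp constructor compiles them.
const classSample = 'Order 42, qty: 7_units';

// \d+ with the g flag collects every run of digits.
const digitRuns = classSample.match(new RegExp('\\d+', 'g')) ?? [];
// digitRuns: ['42', '7']

// \w+ matches runs of letters, digits, and underscores.
const wordRuns = classSample.match(new RegExp('\\w+', 'g')) ?? [];
// wordRuns: ['Order', '42', 'qty', '7_units']

// \s+ matches runs of whitespace (spaces, tabs, newlines), so it can split fields.
const fields = 'a b\tc'.split(new RegExp('\\s+'));
// fields: ['a', 'b', 'c']

// [^abc] is a negated set: it matches anything except a, b, or c.
const onlyAbc = 'a1b2c3'.replace(new RegExp('[^abc]', 'g'), '');
// onlyAbc: 'abc'

// . does not cross a line break, so .+ cannot span both lines here.
const dotStopsAtNewline = new RegExp('^.+$').test('one\ntwo');
// dotStopsAtNewline: false
```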

2. Anchors

Position matching without consuming characters:

  • ^ / $ - Start / End of input (start / end of each line with the m flag)
  • \b / \B - Word boundary / Non-word boundary
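
A small sketch of how these anchors behave, again assuming JavaScript's engine; the input string is made up for illustration. Anchors match positions, so none of them adds characters to the match.

```typescript
const twoLines = 'cat\ncatalog';

// Without the m flag, ^ matches only at the very start of the input.
const startsPlain = twoLines.match(new RegExp('^cat', 'g')) ?? [];
// startsPlain.length: 1

// With the m flag, ^ also matches right after each line break.
const startsMultiline = twoLines.match(new RegExp('^cat', 'gm')) ?? [];
// startsMultiline.length: 2

// \bcat\b needs a word boundary on both sides, so the "cat" in "catalog" is skipped.
const wholeWord = twoLines.match(new RegExp('\\bcat\\b', 'g')) ?? [];
// wholeWord.length: 1

// cat\B needs a word character right after "cat", so only "catalog" qualifies.
const insideWord = twoLines.match(new RegExp('cat\\B', 'g')) ?? [];
// insideWord.length: 1
```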

3. Quantifiers

Control match repetition:

  • * / + / ? - Zero or more / One or more / Zero or one
  • {n} / {n,} / {n,m} - Exact / At least / Range
  • *? / +? / ?? / {n,m}? - Lazy (non-greedy) variants
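
The difference between greedy and lazy quantifiers is easiest to see side by side. The sketch below uses a hypothetical helper, firstMatch, that is not part of the cheatsheet; it only returns the first matched substring.

```typescript
// Hypothetical helper: returns the first match of a string pattern, or null.
function firstMatch(pattern: string, text: string): string | null {
  const result = new RegExp(pattern).exec(text);
  return result?.[0] ?? null;
}

const tagged = '<b>one</b><b>two</b>';

// Greedy .* grabs as much as possible, so it runs to the last </b>.
const greedyTag = firstMatch('<b>.*</b>', tagged);
// greedyTag: '<b>one</b><b>two</b>'

// Lazy .*? stops at the first </b> that lets the match succeed.
const lazyTag = firstMatch('<b>.*?</b>', tagged);
// lazyTag: '<b>one</b>'

// {2,3} prefers the upper bound; {2,3}? prefers the lower bound.
const greedyRange = firstMatch('\\d{2,3}', '12345');
// greedyRange: '123'
const lazyRange = firstMatch('\\d{2,3}?', '12345');
// lazyRange: '12'

// ? makes the preceding item optional.
const optionalLetter = firstMatch('colou?r', 'color');
// optionalLetter: 'color'
```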

4. Groups and Assertions

  • (abc) / (?:abc) - Capturing / Non-capturing group
  • (?=abc) / (?!abc) - Positive / Negative lookahead
  • (?<=abc) / (?<!abc) - Positive / Negative lookbehind
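
As a rough illustration of groups and assertions, the sketch below runs each construct against a made-up input. It assumes a JavaScript engine with lookbehind support (ES2018 or later); the strings and variable names are illustrative.

```typescript
// Capturing groups expose their sub-matches by index.
const yearMonth = new RegExp('(\\d{4})-(\\d{2})').exec('2024-05');
const capturedYear = yearMonth?.[1] ?? null;   // '2024'
const capturedMonth = yearMonth?.[2] ?? null;  // '05'

// A non-capturing group still groups for the quantifier but records no capture.
const repeated = new RegExp('(?:ab)+').exec('ababab');
const repeatedText = repeated?.[0] ?? null;                // 'ababab'
const captureCount = repeated ? repeated.length - 1 : -1;  // 0

// Positive lookahead: digits that are followed by "px" (the "px" is not consumed).
const pxValues = '10px 20em 30px'.match(new RegExp('\\d+(?=px)', 'g')) ?? [];
// pxValues: ['10', '30']

// Negative lookahead: "foo" that is not followed by "bar".
const notBeforeBar = 'foobar foobaz'.replace(new RegExp('foo(?!bar)', 'g'), 'X');
// notBeforeBar: 'foobar Xbaz'

// Positive lookbehind: digits that are preceded by "$".
const prices = 'cost $40, qty 3'.match(new RegExp('(?<=\\$)\\d+', 'g')) ?? [];
// prices: ['40']

// Negative lookbehind: whole numbers that are not preceded by "$".
const nonPrices = 'cost $40, qty 3'.match(new RegExp('(?<!\\$)\\b\\d+', 'g')) ?? [];
// nonPrices: ['3']
```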

5. Flags

  • i - Case insensitive
  • g - Global match
  • m - Multiline mode
  • s - Dot matches newline
  • u - Unicode mode
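
Each flag changes matching in a way that is easy to check with a before/after pair. The sketch below assumes a JavaScript engine that supports the s flag (ES2018 or later); inputs are illustrative.

```typescript
// i: letter case is ignored.
const ignoreCase = new RegExp('hello', 'i').test('HELLO');         // true

// g: replace (and match) act on every occurrence instead of only the first.
const firstOnly = 'a-b-c'.replace(new RegExp('-'), '+');           // 'a+b-c'
const everyMatch = 'a-b-c'.replace(new RegExp('-', 'g'), '+');     // 'a+b+c'

// m: $ (and ^) also match at line breaks, not just at the ends of the input.
const lineEndsPlain = 'ab\ncb'.match(new RegExp('b$', 'g')) ?? [];   // 1 match
const lineEndsMulti = 'ab\ncb'.match(new RegExp('b$', 'gm')) ?? [];  // 2 matches

// s: the dot also matches line terminators.
const dotPlain = new RegExp('a.b').test('a\nb');                   // false
const dotAll = new RegExp('a.b', 's').test('a\nb');                // true

// u: matching works on code points, so one emoji counts as one character.
const emoji = '\uD83D\uDE00'; // a single emoji stored as a surrogate pair
const perCodeUnit = new RegExp('^.$').test(emoji);                 // false
const perCodePoint = new RegExp('^.$', 'u').test(emoji);           // true
```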

6. Common Patterns

Pre-built practical regexes: email, URL, IP address, phone number, date format, etc.

Data Structure Design

Define the regex pattern structure with TypeScript:

interface RegexPattern {
  category: string      // character, anchor, quantifier, etc.
  pattern: string       // Regex syntax: \d, \w, ^, $, etc.
  description: string   // Chinese description
  descriptionEn: string // English description
  example?: string      // Example: \d+ → 123
}

const regexPatterns: RegexPattern[] = [
  { 
    category: 'character', 
    pattern: '\\d', 
    description: '匹配数字 [0-9]', 
    descriptionEn: 'Digit [0-9]', 
    example: '\\d+ → 123' 
  },
  { 
    category: 'anchor', 
    pattern: '^', 
    description: '匹配行首', 
    descriptionEn: 'Start of line', 
    example: '^Hello → Hello world' 
  },
  // ... 100+ patterns
]

Design decisions:

  • The category field enables filtering by syntax type
  • Bilingual descriptions support i18n out of the box
  • The example field provides real matches, more intuitive than abstract descriptions
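
Because the example field pairs a regex with a sample string, the data can check itself. The sketch below is one possible sanity check, not part of the cheatsheet: it assumes every example follows the "regex → sample" shape seen in the two entries above, and the names CheatsheetEntry and exampleMatches are hypothetical.

```typescript
// Minimal shape needed for the check (a subset of RegexPattern).
interface CheatsheetEntry {
  pattern: string;
  example?: string;
}

// Returns true/false when the example can be checked, or null when there is
// no example or it does not follow the assumed "regex → sample" format.
function exampleMatches(entry: CheatsheetEntry): boolean | null {
  if (!entry.example) return null;

  const separator = ' → ';
  const at = entry.example.indexOf(separator);
  if (at === -1) return null;

  const source = entry.example.slice(0, at);
  const sample = entry.example.slice(at + separator.length);

  try {
    return new RegExp(source).test(sample);
  } catch {
    return false; // the regex part of the example does not compile
  }
}

const digitEntry: CheatsheetEntry = { pattern: '\\d', example: '\\d+ → 123' };
const anchorEntry: CheatsheetEntry = { pattern: '^', example: '^Hello → Hello world' };
const brokenEntry: CheatsheetEntry = { pattern: '\\d', example: '\\d+ → abc' };
const bareEntry: CheatsheetEntry = { pattern: '\\w' };
```

Running a check like this over the whole array in a unit test would flag entries whose sample no longer matches after an edit.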

Search Filtering Implementation

When users type a search query, match it against multiple fields:

// Runs inside the component; requires: import { useMemo } from 'react'
const filteredPatterns = useMemo(() => {
  // Normalize the query once instead of once per field per pattern
  const query = searchQuery.toLowerCase()

  return regexPatterns.filter((pattern) => {
    // Category filter
    const matchesCategory = 
      selectedCategory === 'all' || 
      pattern.category === selectedCategory
    
    // Search filter (match pattern, description, descriptionEn)
    const matchesSearch = 
      query === '' ||
      pattern.pattern.toLowerCase().includes(query) ||
      pattern.description.toLowerCase().includes(query) ||
      pattern.descriptionEn.toLowerCase().includes(query)
    
    return matchesCategory && matchesSearch
  })
}, [searchQuery, selectedCategory])

Optimizations:

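  • useMemo caches the filtered result, so the list is recomputed only when searchQuery or selectedCategory changes rather than on every render
  • Case-insensitive search: digit and DIGIT both work. The trade-off is that lowercasing the pattern field makes \d and \D indistinguishable, so a search for \d returns both
  • Multi-field matching improves the hit rate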
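
The same filtering logic can be pulled out of the hook into a plain function, which makes it easy to test without rendering anything. This is a sketch under that assumption; filterPatterns and SearchablePattern are hypothetical names, and the three sample entries are illustrative.

```typescript
interface SearchablePattern {
  category: string;
  pattern: string;
  description: string;
  descriptionEn: string;
}

// Pure version of the filter: same rules as the useMemo callback.
function filterPatterns<T extends SearchablePattern>(
  items: readonly T[],
  searchQuery: string,
  selectedCategory: string,
): T[] {
  const query = searchQuery.toLowerCase();

  return items.filter((item) => {
    const matchesCategory =
      selectedCategory === 'all' || item.category === selectedCategory;

    const matchesSearch =
      query === '' ||
      [item.pattern, item.description, item.descriptionEn].some((field) =>
        field.toLowerCase().includes(query),
      );

    return matchesCategory && matchesSearch;
  });
}

const samplePatterns: SearchablePattern[] = [
  { category: 'character', pattern: '\\d', description: '匹配数字 [0-9]', descriptionEn: 'Digit [0-9]' },
  { category: 'character', pattern: '\\D', description: '匹配非数字', descriptionEn: 'Non-digit' },
  { category: 'anchor', pattern: '^', description: '匹配行首', descriptionEn: 'Start of line' },
];

// Case-insensitive text search hits both digit entries.
const byWord = filterPatterns(samplePatterns, 'DIGIT', 'all');       // 2 entries
// Lowercasing also folds \D into \d, so both are returned here too.
const byPattern = filterPatterns(samplePatterns, '\\d', 'all');      // 2 entries
// An empty query with a category selected returns that category only.
const anchorsOnly = filterPatterns(samplePatterns, '', 'anchor');    // 1 entry
```

Inside the component, the useMemo callback could then simply return filterPatterns(regexPatterns, searchQuery, selectedCategory).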

Category Filter Buttons

State management for category buttons is straightforward:

const categories = [
  { id: 'all', name: '全部', nameEn: 'All' },
  { id: 'character', name: '字符类', nameEn: 'Character Classes' },
  { id: 'anchor', name: '锚点', nameEn: 'Anchors' },
  // ...
]

<div className="flex flex-wrap gap-2">
  {categories.map((cat) => (
    <button
      key={cat.id}
      onClick={() => setSelectedCategory(cat.id)}
      className={selectedCategory === cat.id 
        ? 'bg-accent-cyan text-white' 
        : 'bg-bg-secondary text-text-secondary'
      }
    >
      {locale === 'zh' ? cat.name : cat.nameEn}
    </button>
  ))}
</div>

UX details: the selected category is highlighted and the rest use a muted color; gap-2 spacing between the buttons makes accidental clicks less likely.

One-Click Copy

Click the copy button to write the regex syntax to the clipboard:

const copyToClipboard = async (text: string, index: number) => {
  try {
    await navigator.clipboard.writeText(text)
    setCopiedIndex(index)
    setTimeout(() => setCopiedIndex(null), 2000)
  } catch (err) {
    console.error('Failed to copy:', err)
  }
}

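User experience: “Copied” feedback is shown for 2 seconds after a successful copy. navigator.clipboard is only available in secure contexts (HTTPS or localhost), and writeText can be rejected when permission is denied, so the call is wrapped in try-catch.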
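
One detail worth noting about the timer: each copy starts a new 2-second timeout without cancelling the previous one, so copying a second item within 2 seconds lets the first timeout clear the second item's feedback early. The sketch below shows one way to avoid that. createCopiedTracker, Schedule, and timerSchedule are hypothetical names, and the scheduler is injectable only so the behavior can be tested without real timers.

```typescript
type Cancel = () => void;
type Schedule = (callback: () => void, delayMs: number) => Cancel;

// Default scheduler backed by the real timer functions.
const timerSchedule: Schedule = (callback, delayMs) => {
  const id = setTimeout(callback, delayMs);
  return () => clearTimeout(id);
};

// Tracks which item currently shows "Copied" and resets it after visibleMs.
function createCopiedTracker(
  onChange: (index: number | null) => void,
  schedule: Schedule = timerSchedule,
  visibleMs = 2000,
) {
  let cancelPending: Cancel | null = null;

  return {
    markCopied(index: number): void {
      // Drop the reset scheduled by an earlier copy so it cannot fire early.
      if (cancelPending) cancelPending();
      onChange(index);
      cancelPending = schedule(() => {
        cancelPending = null;
        onChange(null);
      }, visibleMs);
    },
  };
}
```

In the component, a tracker created once with setCopiedIndex as onChange (for example, held in a ref) could replace the setCopiedIndex/setTimeout pair after a successful writeText.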

Common Regex Patterns

The cheatsheet includes 15+ practical patterns covering real-world scenarios:

// Email address (simplified: allows dots and + in the local part, and multi-level domains)
{ pattern: '^[\\w.+-]+@[\\w-]+(\\.[\\w-]+)+$', description: 'Email address' }

// URL
{ pattern: '^https?://[^\\s]+$', description: 'URL' }

// IPv4 address (format only; octet ranges are not checked)
{ pattern: '^\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}$', description: 'IPv4 address' }

// Date YYYY-MM-DD
{ pattern: '^\\d{4}-\\d{2}-\\d{2}$', description: 'Date (YYYY-MM-DD)' }

// Hex color
{ pattern: '^#[0-9A-Fa-f]{6}$', description: 'Hex color' }

// PascalCase (UpperCamelCase)
{ pattern: '^[A-Z][a-z]+([A-Z][a-z]+)*$', description: 'PascalCase' }

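Copy these patterns directly into your projects as a starting point. They check format only (the IPv4 pattern accepts 999.999.999.999, for example), so add range or semantic validation where correctness matters.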
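
To make the limits of the IPv4 and date patterns concrete, the sketch below runs them against out-of-range values and pairs them with stricter checks. The stricter IPv4 pattern and the isRealDate helper are illustrative additions, not patterns taken from the cheatsheet.

```typescript
// The IPv4 and date patterns as listed above.
const ipv4Loose = new RegExp('^\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}$');
const dateLoose = new RegExp('^\\d{4}-\\d{2}-\\d{2}$');

// Both accept values that have the right shape but are out of range.
const looseAcceptsBadIp = ipv4Loose.test('999.999.999.999');   // true
const looseAcceptsBadDate = dateLoose.test('2024-13-45');      // true

// A stricter octet: 250-255, 200-249, 100-199, or 0-99 without a leading zero.
const octet = '(?:25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)';
const ipv4Strict = new RegExp('^' + octet + '(?:\\.' + octet + '){3}$');

// Dates are easier to validate by parsing than by regex alone: check the
// shape first, then confirm the parsed date round-trips to the same text.
function isRealDate(text: string): boolean {
  if (!dateLoose.test(text)) return false;
  const parsed = new Date(text + 'T00:00:00Z');
  return !isNaN(parsed.getTime()) && parsed.toISOString().slice(0, 10) === text;
}
```

The round-trip comparison in isRealDate is there because engines differ on whether an impossible day such as February 30 is rejected or rolled over into the next month.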

Performance and Extensibility

Performance: with only 100+ patterns, a full filter pass completes in milliseconds, and useMemo skips it entirely on renders where neither input changed, so no virtual scrolling is needed.

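Extensibility: Add new patterns by appending to the regexPatterns array; the filtering logic picks them up with no further changes. A brand-new category also needs an entry in the categories array so that its button appears. For example, adding a Unicode pattern (which requires the u flag when compiled):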

{ 
  category: 'common', 
  pattern: '\\p{Script=Han}', 
  description: '中文字符', 
  descriptionEn: 'Chinese characters', 
  example: '\\p{Script=Han}+ → 你好世界' 
}
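
A quick check of why the u flag matters for this entry. The sketch assumes a JavaScript engine with Unicode property escapes (ES2018 or later); variable names are illustrative.

```typescript
const hanSource = '\\p{Script=Han}+';

// With the u flag, \p{...} is a Unicode property escape.
const hanWithU = new RegExp(hanSource, 'u');
// Without it, \p is not treated as a property escape, so Han text is not matched.
const hanWithoutU = new RegExp(hanSource);

const matchesHan = hanWithU.test('你好世界');            // true
const missesWithoutU = hanWithoutU.test('你好世界');     // false

// The match stops at the first non-Han character.
const leadingHan = hanWithU.exec('你好 world')?.[0] ?? null;  // '你好'
```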

Practical Value

A regex cheatsheet isn’t about memorizing syntax—it’s about:

  1. Quick lookup: Forgot what \b means? Search “boundary” and find it instantly
  2. Copy and use: Email regex, URL regex—copy directly to your project
  3. Learning reference: Understand syntax through examples, more intuitive than docs

Next time you write regex, open JsonKit Regex Cheatsheet instead of digging through MDN docs.

