
Testing Liquid Tokenization Implementation in Shopify/liquid

This test suite validates the tokenization functionality in Liquid, Shopify’s template language. It verifies that source text is split correctly into raw strings, variable tokens, and block tokens, and that line numbers are tracked accurately for each token.

Test Coverage Overview

The test suite covers the tokenizer’s core behaviors: splitting template source into raw text, variable, and block tokens, and reporting the line on which each token starts.

Key areas tested include:
  • Basic string tokenization
  • Variable token parsing with {{ }} syntax
  • Block token handling with {% %} syntax (both illustrated in the example after this list)
  • Line number calculation and tracking
  • Edge cases like nil input handling
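
For instance, a hypothetical additional test in the same style as test_tokenize_variables and test_tokenize_blocks (the method name and inputs below are illustrative, not from the file) would assert exact token streams for input mixing text, variables, and blocks:

def test_tokenize_mixed_text_and_tags
  assert_equal(['Hello ', '{{ name }}', '!'], tokenize('Hello {{ name }}!'))
  assert_equal(['{% if admin %}', 'Hi', '{% endif %}'], tokenize('{% if admin %}Hi{% endif %}'))
end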

Implementation Analysis

The suite is built on the Minitest framework, with an isolated unit test for each tokenization feature. Every test follows the same pattern: tokenize a source string through private helper methods, then assert on the resulting token stream, its whitespace handling, and its line numbers.

The tests construct tokenizers through Liquid::ParseContext and extract tokens one at a time to validate parsing behavior.
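
The sketch below condenses the suite’s own helper pattern: Liquid::ParseContext#new_tokenizer builds the tokenizer, and the private shift method (called via send, since it is exposed only for unit testing) returns one token per call until the source is exhausted:

require 'liquid'

parse_context = Liquid::ParseContext.new
tokenizer = parse_context.new_tokenizer(' {{funk}} ')

tokens = []
# shift yields the next token, or nil once the source is exhausted
while (t = tokenizer.send(:shift))
  tokens << t
end

p tokens # => [" ", "{{funk}}", " "], as asserted in test_tokenize_variables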

Technical Details

Testing tools and setup:
  • Minitest as the testing framework
  • Tokenizer instances created via Liquid::ParseContext#new_tokenizer
  • Private helper methods for token extraction
  • Line number tracking via start_line_number and tokenizer.line_number (sketched after this list)
  • Integration with Liquid’s parsing context
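
Line-number tracking follows the same pattern: the tokenizer is created with start_line_number: 1, and tokenizer.line_number is read before each shift. The input below is hypothetical, but the expected output follows the assertions in test_calculate_line_numbers_per_token_with_profiling:

tokenizer = Liquid::ParseContext.new.new_tokenizer("a\n{{b}}\nc", start_line_number: 1)

line_numbers = []
loop do
  line = tokenizer.line_number # read before shift advances the stream
  break unless tokenizer.send(:shift)
  line_numbers << line
end

p line_numbers # => [1, 2, 2]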

Best Practices Demonstrated

The test suite exemplifies several testing best practices:

  • Isolated test cases for specific functionality (see the example after this list)
  • Comprehensive edge case coverage
  • Clear test method naming conventions
  • Proper test helper organization
  • Effective use of assert statements
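
As one more hypothetical example in the same style (assuming empty input behaves like the nil input covered by test_tokenize_with_nil_source_returns_empty_array), an isolated, descriptively named case reads as a one-line specification:

def test_tokenize_empty_string_returns_empty_array
  assert_equal([], tokenize(''))
end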

shopify/liquid

test/unit/tokenizer_unit_test.rb

# frozen_string_literal: true

require 'test_helper'

class TokenizerTest < Minitest::Test
  def test_tokenize_strings
    assert_equal([' '], tokenize(' '))
    assert_equal(['hello world'], tokenize('hello world'))
  end

  def test_tokenize_variables
    assert_equal(['{{funk}}'], tokenize('{{funk}}'))
    assert_equal([' ', '{{funk}}', ' '], tokenize(' {{funk}} '))
    assert_equal([' ', '{{funk}}', ' ', '{{so}}', ' ', '{{brother}}', ' '], tokenize(' {{funk}} {{so}} {{brother}} '))
    assert_equal([' ', '{{  funk  }}', ' '], tokenize(' {{  funk  }} '))
  end

  def test_tokenize_blocks
    assert_equal(['{%comment%}'], tokenize('{%comment%}'))
    assert_equal([' ', '{%comment%}', ' '], tokenize(' {%comment%} '))

    assert_equal([' ', '{%comment%}', ' ', '{%endcomment%}', ' '], tokenize(' {%comment%} {%endcomment%} '))
    assert_equal(['  ', '{% comment %}', ' ', '{% endcomment %}', ' '], tokenize("  {% comment %} {% endcomment %} "))
  end

  def test_calculate_line_numbers_per_token_with_profiling
    assert_equal([1],       tokenize_line_numbers("{{funk}}"))
    assert_equal([1, 1, 1], tokenize_line_numbers(" {{funk}} "))
    assert_equal([1, 2, 2], tokenize_line_numbers("\n{{funk}}\n"))
    assert_equal([1, 1, 3], tokenize_line_numbers(" {{\n funk \n}} "))
  end

  def test_tokenize_with_nil_source_returns_empty_array
    assert_equal([], tokenize(nil))
  end

  private

  def new_tokenizer(source, parse_context: Liquid::ParseContext.new, start_line_number: nil)
    parse_context.new_tokenizer(source, start_line_number: start_line_number)
  end

  def tokenize(source)
    tokenizer = new_tokenizer(source)
    tokens    = []
    # shift is private in Liquid::C::Tokenizer, since it is only for unit testing
    while (t = tokenizer.send(:shift))
      tokens << t
    end
    tokens
  end

  def tokenize_line_numbers(source)
    tokenizer    = new_tokenizer(source, start_line_number: 1)
    line_numbers = []
    loop do
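      # Read the line number before shifting: shift advances the
      # tokenizer past the token whose starting line we record.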
      line_number = tokenizer.line_number
      if tokenizer.send(:shift)
        line_numbers << line_number
      else
        break
      end
    end
    line_numbers
  end
end