
Testing HTML Sanitization Filter Implementation in DevDocs

This test suite validates the HTML cleaning functionality in the DevDocs documentation system, focusing on sanitizing HTML content by removing unwanted elements and normalizing whitespace while preserving essential formatting.

Test Coverage Overview

The suite contains five test cases, one for each cleaning behavior of the filter.

Key areas tested include:
  • Removal of script and style tags
  • Comment elimination
  • Whitespace normalization
  • Preservation of formatting in pre/code blocks
  • Pass-through of invalid character sequences (the filter must not strip them)
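
The behaviors listed above can be illustrated with a small, string-based stand-in. This is only a sketch: toy_clean_html is a hypothetical name, and it uses regular expressions purely for brevity, whereas the filter under test works on a document parsed by Nokogiri. It does not handle nested or mismatched pre/code elements.

```ruby
# Simplified, regex-based stand-in for illustration only.
# Assumption: this is NOT how Docs::CleanHtmlFilter is implemented.
def toy_clean_html(html)
  # Drop <script> and <style> elements together with their contents.
  cleaned = html.gsub(%r{<(script|style)\b[^>]*>.*?</\1>}mi, '')
  # Drop HTML comments.
  cleaned = cleaned.gsub(/<!--.*?-->/m, '')
  # Split out <pre>/<code> elements so their whitespace is left untouched.
  # The capture group makes split keep the matched elements in the result.
  parts = cleaned.split(%r{(<(?:pre|code)\b[^>]*>.*?</(?:pre|code)>)}mi)
  parts.map { |part|
    if part.match?(/\A<(?:pre|code)\b/i)
      part                              # preserved verbatim
    else
      part.gsub(/[ \t\r\n]+/, ' ')      # collapse each whitespace run to one space
    end
  }.join
end

puts toy_clean_html("<p> \nTest <b></b> \n</p>")
```

Fed the inputs from the first four test cases, this stand-in produces the same strings the real suite expects, which makes the intent of each assertion easier to see.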

Implementation Analysis

The tests are written in Minitest’s spec style (it blocks inside a Minitest::Spec subclass) and include FilterTestHelper. Every test follows the same pattern: assign the input HTML to @body, run the filter through filter_output_string, and compare the result with the expected string using assert_equal.

Minitest ships with Ruby, and each case is only two lines long: one for the input and one for the assertion.
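
The set-input, run-filter, assert pattern can be sketched in plain Ruby. Everything here is a hypothetical stand-in: ToyFilterHelper, UpcaseFilter, and UpcaseFilterCheck are invented for illustration, and the real FilterTestHelper in DevDocs may be structured differently.

```ruby
# Hypothetical toy filter: takes HTML as a string and returns it upcased.
class UpcaseFilter
  def initialize(html)
    @html = html
  end

  def call
    @html.upcase
  end
end

# Hypothetical helper mirroring the shape used in the test file:
# a class-level filter_class setting plus a filter_output_string method.
module ToyFilterHelper
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    attr_accessor :filter_class
  end

  # Run the declared filter over whatever the test stored in @body.
  def filter_output_string
    self.class.filter_class.new(@body.to_s).call.to_s
  end
end

class UpcaseFilterCheck
  include ToyFilterHelper
  self.filter_class = UpcaseFilter

  def run
    @body = '<p>test</p>'            # arrange: input HTML
    output = filter_output_string    # act: apply the filter
    raise 'unexpected output' unless output == '<P>TEST</P>'  # assert
    output
  end
end

puts UpcaseFilterCheck.new.run
```

Keeping the filter class as a class-level setting means each test body only has to state its input and its expected output.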

Technical Details

Testing tools and components:
  • Minitest framework for test execution
  • FilterTestHelper module for shared functionality
  • Nokogiri for HTML parsing
  • Custom filter_output_string method for processing
  • CleanHtmlFilter class as the system under test

Best Practices Demonstrated

The test suite keeps each test case isolated, names every test after the behavior it checks, and covers edge cases such as invalid input alongside the normal ones.

Notable practices include:
  • Single responsibility per test case
  • Consistent pattern of input setup followed by a single assertion
  • Explicit test descriptions
  • Coverage of both normal and edge cases
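
The invalid-input edge case uses the single byte \x92. The sketch below shows, with core Ruby only, why that input is unusual: on its own the byte is not valid UTF-8, while in Windows-1252 it encodes a right single quotation mark. That the test input originates from Windows-1252 text is an assumption, and the variable names are illustrative.

```ruby
raw = "\x92"   # the same byte sequence the test passes to the parser

# In a UTF-8 source file the literal is tagged UTF-8 but is not valid UTF-8,
# because 0x92 is a continuation byte with no leading byte before it.
utf8_valid = raw.valid_encoding?

# Assumption: reinterpreting the byte as Windows-1252 yields U+2019,
# the right single quotation mark.
as_cp1252 = raw.dup.force_encoding(Encoding::Windows_1252).encode(Encoding::UTF_8)

puts utf8_valid          # false
puts as_cp1252 == "\u2019"   # true
```

The test itself makes no claim about how the byte should be interpreted. It only checks that the filter's output equals what Nokogiri serializes for the same input.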

freecodecamp/devdocs

test/lib/docs/filters/core/clean_html_test.rb

            
require_relative '../../../../test_helper'
require_relative '../../../../../lib/docs'

class CleanHtmlFilterTest < Minitest::Spec
  include FilterTestHelper
  self.filter_class = Docs::CleanHtmlFilter

  it "removes <script> and <style>" do
    @body = '<div><script></script><style></style></div>'
    assert_equal '<div></div>', filter_output_string
  end

  it "removes comments" do
    @body = '<!-- test --><div>Test<!-- test --></div>'
    assert_equal '<div>Test</div>', filter_output_string
  end

  it "removes extraneous whitespace" do
    @body = "<p> \nTest <b></b> \n</p> \n<div>\r</div>\n\n "
    assert_equal '<p> Test <b></b> </p> <div> </div> ', filter_output_string
  end

  it "doesn't remove whitespace from <pre> and <code> nodes" do
    @body = "<pre> \nTest\r </pre><code> \nTest </code>"
    assert_equal @body, filter_output_string
  end

  it "doesn't remove invalid strings" do
    # A lone 0x92 byte is not valid UTF-8; the filter must leave Nokogiri's output as-is.
    @body = Nokogiri::HTML.parse "\x92"
    assert_equal @body.to_s, filter_output_string
  end
end