
Validating ST Language Tokenization in monaco-editor

This test suite validates tokenization for the Structured Text (ST) programming language in the Monaco Editor. It verifies that syntax highlighting assigns the correct token classification to ST language elements, including variable declarations, keywords, numbers, and comments.

Test Coverage Overview

The test suite provides comprehensive coverage for ST language tokenization:

  • Variable declarations with a data type (BOOL)
  • Address declarations using AT keyword
  • Conditional statements (IF-THEN)
  • Timer function blocks (TON)
  • Various number formats (decimal, binary, hexadecimal, float)

Implementation Analysis

The testing approach uses a structured pattern for tokenization verification. Each test case defines an input line and expected token classifications with precise start indices and types. The implementation leverages the testTokenization helper function to validate token boundaries and types systematically.
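
Each case pairs an input line with an ordered list of expected tokens. The shape below is a minimal sketch inferred from how testTokenization is called in the file further down; the interface names are illustrative, not the testRunner's published typings:

interface ExpectedToken {
	startIndex: number; // column at which the token begins
	type: string; // e.g. 'identifier.st', 'keyword.st', or '' for operators
}

interface TestLine {
	line: string; // the ST source line to tokenize
	tokens: ExpectedToken[]; // expected tokens, in order of appearance
}

// Each top-level array passed to testTokenization is one tokenization run;
// grouping several TestLine objects in a run would carry tokenizer state
// (such as an open block comment) across lines.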

Technical Details

Testing tools and configuration:

  • Shared testRunner utility providing the testTokenization helper
  • Token classification system for ST language elements
  • Support for multiple token types: identifier, keyword, comment, number, delimiter
  • Precise token boundary tracking via the startIndex property (see the sketch after this list)
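
Because only start indices are asserted, each token's text is implied by the gap to the next token's startIndex, or by the end of the line for the final token. The helper below is hypothetical and not part of the suite; it merely illustrates that relationship:

function sliceTokens(
	line: string,
	tokens: { startIndex: number; type: string }[]
): string[] {
	// A token spans from its own startIndex up to the next token's
	// startIndex, or to the end of the line for the last token.
	return tokens.map((t, i) => {
		const end = i + 1 < tokens.length ? tokens[i + 1].startIndex : line.length;
		return line.substring(t.startIndex, end);
	});
}

// Applied to the first case in the file below:
// sliceTokens('xVar : BOOL;', tokens) -> ['xVar', ' ', ':', ' ', 'BOOL', ';']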

Best Practices Demonstrated

The test suite exhibits several testing best practices:

  • Comprehensive coverage of language elements
  • Explicit test cases for different number formats (see the example after this list)
  • Clear separation of test cases by functionality
  • Detailed token boundary verification
  • Proper handling of whitespace and comments
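
New cases slot into the same shape. For example, a hypothetical additional binary-literal case, assuming the tokenizer classifies it the same way as the existing 2#000_0101 case:

import { testTokenization } from '../test/testRunner';

testTokenization('st', [
	[
		{
			line: '2#1010',
			tokens: [{ startIndex: 0, type: 'number.binary.st' }]
		}
	]
]);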

microsoft/monaco-editor

src/basic-languages/st/st.test.ts

/*---------------------------------------------------------------------------------------------
 *  Copyright (c) Microsoft Corporation. All rights reserved.
 *  Licensed under the MIT License. See License.txt in the project root for license information.
 *--------------------------------------------------------------------------------------------*/

import { testTokenization } from '../test/testRunner';

testTokenization('st', [
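	// Variable declaration with a BOOL data type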
	[
		{
			line: 'xVar : BOOL;',
			tokens: [
				{ startIndex: 0, type: 'identifier.st' },
				{ startIndex: 4, type: 'white.st' },
				{ startIndex: 5, type: '' },
				{ startIndex: 6, type: 'white.st' },
				{ startIndex: 7, type: 'type.st' },
				{ startIndex: 11, type: 'delimiter.st' }
			]
		}
	],
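	// Address declaration: AT binds the variable to direct address %IX0.0.1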
	[
		{
			line: 'xStart AT %IX0.0.1: BOOL := TRUE;',
			tokens: [
				{ startIndex: 0, type: 'identifier.st' },
				{ startIndex: 6, type: 'white.st' },
				{ startIndex: 7, type: 'keyword.st' },
				{ startIndex: 9, type: 'white.st' },
				{ startIndex: 10, type: 'tag.st' },
				{ startIndex: 18, type: '' },
				{ startIndex: 19, type: 'white.st' },
				{ startIndex: 20, type: 'type.st' },
				{ startIndex: 24, type: 'white.st' },
				{ startIndex: 25, type: '' },
				{ startIndex: 27, type: 'white.st' },
				{ startIndex: 28, type: 'constant.st' },
				{ startIndex: 32, type: 'delimiter.st' }
			]
		}
	],
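	// IF ... THEN condition with a binary literal and a (* ... *) block comment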
	[
		{
			line: "IF a > 2#0000_0110 THEN (* Something ' happens *)",
			tokens: [
				{ startIndex: 0, type: 'keyword.st' },
				{ startIndex: 2, type: 'white.st' },
				{ startIndex: 3, type: 'identifier.st' },
				{ startIndex: 4, type: 'white.st' },
				{ startIndex: 5, type: '' },
				{ startIndex: 6, type: 'white.st' },
				{ startIndex: 7, type: 'number.binary.st' },
				{ startIndex: 18, type: 'white.st' },
				{ startIndex: 19, type: 'keyword.st' },
				{ startIndex: 23, type: 'white.st' },
				{ startIndex: 24, type: 'comment.st' }
			]
		}
	],
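	// TON function block invocation with a time literal and a trailing line comment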
	[
		{
			line: 'TON1(IN := TRUE, PT := T#20ms, Q => xStart); // Run timer',
			tokens: [
				{ startIndex: 0, type: 'identifier.st' },
				{ startIndex: 4, type: 'delimiter.parenthesis.st' },
				{ startIndex: 5, type: 'identifier.st' },
				{ startIndex: 7, type: 'white.st' },
				{ startIndex: 8, type: '' },
				{ startIndex: 10, type: 'white.st' },
				{ startIndex: 11, type: 'constant.st' },
				{ startIndex: 15, type: '' },
				{ startIndex: 16, type: 'white.st' },
				{ startIndex: 17, type: 'identifier.st' },
				{ startIndex: 19, type: 'white.st' },
				{ startIndex: 20, type: '' },
				{ startIndex: 22, type: 'white.st' },
				{ startIndex: 23, type: 'tag.st' },
				{ startIndex: 29, type: '' },
				{ startIndex: 30, type: 'white.st' },
				{ startIndex: 31, type: 'identifier.st' },
				{ startIndex: 32, type: 'white.st' },
				{ startIndex: 33, type: '' },
				{ startIndex: 35, type: 'white.st' },
				{ startIndex: 36, type: 'identifier.st' },
				{ startIndex: 42, type: 'delimiter.parenthesis.st' },
				{ startIndex: 43, type: 'delimiter.st' },
				{ startIndex: 44, type: 'white.st' },
				{ startIndex: 45, type: 'comment.st' }
			]
		}
	],
	// Numbers
	[
		{
			line: '0',
			tokens: [{ startIndex: 0, type: 'number.st' }]
		}
	],

	[
		{
			line: '0.0',
			tokens: [{ startIndex: 0, type: 'number.float.st' }]
		}
	],

	[
		{
			line: '2#000_0101',
			tokens: [{ startIndex: 0, type: 'number.binary.st' }]
		}
	],
	[
		{
			line: '16#0f',
			tokens: [{ startIndex: 0, type: 'number.hex.st' }]
		}
	],

	[
		{
			line: '23.5',
			tokens: [{ startIndex: 0, type: 'number.float.st' }]
		}
	],

	[
		{
			line: '23.5e3',
			tokens: [{ startIndex: 0, type: 'number.float.st' }]
		}
	],

	[
		{
			line: '23.5E3',
			tokens: [{ startIndex: 0, type: 'number.float.st' }]
		}
	]
]);