Daily DAX : Day 203 UNICODE

The UNICODE function in Power BI DAX returns the Unicode code point (numeric value) of the first character in a given text string. It is useful for text analysis, data cleansing, or scenarios where you need to work with character encodings.

Syntax

UNICODE(<text>)

<text>: A string or a column containing text. The function evaluates only the first character of the input.

Return Value

An integer representing the Unicode code point of the first character.

If the input is empty or null, it returns a null value.

Use Case

The UNICODE function is typically used in:

Text Analysis: To identify or compare characters based on their Unicode values (e.g., distinguishing between similar-looking characters like "A" and "А").

Data Validation/Cleansing: To detect special characters, non-printable characters, or specific symbols in a dataset.

Conditional Formatting: To apply rules based on character types (e.g., filtering out rows starting with certain characters).

Integration with Other Systems: When mapping or transforming text data for systems that rely on Unicode standards.

Example

Suppose you have a column [ProductName] in a table, and you want to find the Unicode value of the first character of each product name.

DAX

UnicodeValue = UNICODE([ProductName])

Sample Data

ProductName UnicodeValue

Apple 65

Banana 66

Çherry 199

123Item 49

For "Apple", the first character is "A", and UNICODE("A") returns 65 (Unicode code point for "A").

For "Çherry", the first character is "Ç", and UNICODE("Ç") returns 199.

Practical Scenario

Use Case: Detecting special characters in a dataset.

You can create a calculated column to flag rows where the first character is a non-standard letter (e.g., Unicode value outside the range for A-Z, 65–90).

DAX

SpecialCharFlag =

IF(

UNICODE([ProductName]) < 65 || UNICODE([ProductName]) > 90,

"Special/Non-Letter",

"Standard Letter"

)

This helps identify entries starting with numbers, symbols, or non-ASCII characters for further data cleansing.

Notes

The function only evaluates the first character. To analyze other characters, you may need to combine it with functions like MID or LEFT.

UNICODE is case-sensitive: UNICODE("A") (65) differs from UNICODE("a") (97).

It aligns with the Unicode standard, making it reliable for internationalization tasks.

For more complex text processing, combine UNICODE with other DAX functions like FIND, SEARCH, or SUBSTITUTE.

Search This Blog

What the DAX?

Daily DAX : Day 203 UNICODE

Comments

Post a Comment

Popular posts from this blog

Daily DAX : Day 65 INFO.TABLEPERMISSIONS

Daily DAX : Day 55 PV

Daily DAX : Day 50 STDEVX.P