Daily DAX : Day 203 UNICODE

 The UNICODE function in Power BI DAX returns the Unicode code point (numeric value) of the first character in a given text string. It is useful for text analysis, data cleansing, or scenarios where you need to work with character encodings.

Syntax


UNICODE(<text>)


    <text>: A string or a column containing text. The function evaluates only the first character of the input.


Return Value


    An integer representing the Unicode code point of the first character.

    If the input is empty or null, it returns a null value.


Use Case

The UNICODE function is typically used in:


    Text Analysis: To identify or compare characters based on their Unicode values (e.g., distinguishing between similar-looking characters like "A" and "А").

    Data Validation/Cleansing: To detect special characters, non-printable characters, or specific symbols in a dataset.

    Conditional Formatting: To apply rules based on character types (e.g., filtering out rows starting with certain characters).

    Integration with Other Systems: When mapping or transforming text data for systems that rely on Unicode standards.


Example

Suppose you have a column [ProductName] in a table, and you want to find the Unicode value of the first character of each product name.

DAX


UnicodeValue = UNICODE([ProductName])


Sample Data            

ProductName                UnicodeValue

Apple                                  65 

Banana                                66                          

Çherry                                 199

123Item                               49


    For "Apple", the first character is "A", and UNICODE("A") returns 65 (Unicode code point for "A").

    For "Çherry", the first character is "Ç", and UNICODE("Ç") returns 199.


Practical Scenario

Use Case: Detecting special characters in a dataset.

You can create a calculated column to flag rows where the first character is a non-standard letter (e.g., Unicode value outside the range for A-Z, 65–90).

DAX


SpecialCharFlag = 

IF(

    UNICODE([ProductName]) < 65 || UNICODE([ProductName]) > 90,

    "Special/Non-Letter",

    "Standard Letter"

)


This helps identify entries starting with numbers, symbols, or non-ASCII characters for further data cleansing.

Notes


    The function only evaluates the first character. To analyze other characters, you may need to combine it with functions like MID or LEFT.

    UNICODE is case-sensitive: UNICODE("A") (65) differs from UNICODE("a") (97).

    It aligns with the Unicode standard, making it reliable for internationalization tasks.


For more complex text processing, combine UNICODE with other DAX functions like FIND, SEARCH, or SUBSTITUTE.

Comments

Popular posts from this blog

Daily DAX : Day 65 INFO.TABLEPERMISSIONS

Daily DAX : Day 55 PV