The easiest way to see to the content of your CSV file is to provide file URL to OPENROWSET function, specify csv FORMAT, and 2.0
Sample:
SELECT
TOP 100 *
FROM
OPENROWSET(
BULK 'URL/FILE.csv',
FORMAT = 'CSV',
PARSER_VERSION = '2.0',
HEADER_ROW = TRUE
) AS [result]
Problem is you may get this:SELECT
TOP 100 *
FROM
OPENROWSET(
BULK 'URL/FILE.csv',
FORMAT = 'CSV',
PARSER_VERSION = '2.0',
HEADER_ROW = TRUE
) AS [result]
Potential conversion error while reading VARCHAR column ... Change database collation to a UTF8 collation or specify explicit column schema in WITH clause and assign UTF8 collation to VARCHAR columns ...
Always use UTF-8 collations to read UTF-8 text in serverless SQL pool
Synapse serverless SQL pool is a query engine that enables you to query a variety of files and formats
Just now, synapse SQL forces conversion of UTF-8 characters to plain VARCHAR characters if UTF-8 collation is not assigned to VARCHAR type.
How to fix the problem?
Anwer is in Error message: Change database collation to a UTF8 collation or specify explicit column schema in WITH clause and assign UTF8 collation to VARCHAR columns.
We have not a table here so use WITH clause.
SELECT
TOP 100 *
FROM
OPENROWSET(
BULK 'URL/FILE.csv',
FORMAT = 'CSV',
PARSER_VERSION = '2.0',
HEADER_ROW = TRUE
)WITH (
Category VARCHAR(254) COLLATE Latin1_General_100_BIN2_UTF8
)AS [result]
Explanation:
Collation in Azure Synapse refers to the way data is sorted and compared in databases. It determines how data is sorted and compared based on character encoding and language-specific rules.
In Azure Synapse, collation settings can be configured at the database, column, or expression level to control how sorting and comparison operations are performed on character data. Collation settings affect operations such as sorting, searching, and comparing strings in SQL statements.
Azure Synapse supports various collation options such as case sensitivity, accent sensitivity, and binary sorting, and provides a default collation for each database. By choosing the
appropriate collation settings, users can ensure that their database operations correctly sort and compare strings in a way that matches their requirements.