Code Like Clockwork - Elegant, dependable, efficient online services.

How to import / export comma-separated value (CSV) files to / from MySQL databases

04 Feb 2010

If you need to import data from CSV or other text-based files to a MySQL database or vice-versa, you can use the LOAD DATA or SELECT INTO OUTFILE statements explained below.

Examples - Importing files with "LOAD DATA"

A MySQL LOAD DATA statement can be used to load values from any text-based files (such as comma separated, tab separated, etc.) into database tables.

In the examples below, a CSV file with the following content is imported:

"column1","column2","column3"
1,"foo",10
2,"bar",20
3,"baz",30

... into a table with the following structure:

CREATE TABLE `test_table` (
`column1` INT NOT NULL,
`column2` VARCHAR(100) NOT NULL,
`column3` INT NOT NULL,
PRIMARY KEY (`column1`)
);

A simple import:

LOAD DATA LOCAL INFILE '/Users/willem/Documents/importfile.csv'
INTO TABLE test_table
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
(column1, column2, column3);

An import with variables and arithmetic:

LOAD DATA LOCAL INFILE '/Users/willem/Documents/importfile.csv'
INTO TABLE test_table
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
(column1, column2, @variable1)
SET column3 = @variable1 / 100;

An import using server variables:

LOAD DATA LOCAL INFILE '/Users/willem/Documents/importfile.csv'
INTO TABLE test_table
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
(column1, column2)
SET column3 = CURRENT_TIMESTAMP;

An import using default values for columns:

LOAD DATA LOCAL INFILE '/Users/willem/Documents/importfile.csv'
INTO TABLE test_table
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
(column1, @dummy, @dummy);

Notes:

  • If an input line contains too many fields, extra ones will be ignored. If an input line contains too few fields, the remaining columns will be set to their default values.
  • LOAD DATA statements regard all inputs as strings, so you cannot use number values for ENUM or SET columns, all values have to be specified as strings.

For more information, see the MySQL LOAD DATA Documentation.

Examples - Exporting files with "SELECT INTO OUTFILE"

A MySQL SELECT INTO OUTFILE statement can be used to export values from a database table to any type of text-based files (such as comma separated, tab separated, etc.).

In the examples below, a CSV file with the following content is exported:

1,"foo",10
2,"bar",20
3,"baz",30

... from a table with the following structure:

CREATE TABLE `test_table` (
`column1` INT NOT NULL,
`column2` VARCHAR(100) NOT NULL,
`column3` INT NOT NULL,
PRIMARY KEY (`column1`)
);

A simple export:

SELECT column1, column2, column3
FROM test_table
INTO OUTFILE 'exportfile.csv'
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';

To perform an export to a host other than the MySQL server, the "mysql -e" command should rather be used to execute a SELECT statement with concatenation and pipe the results into a file:

mysql -u {db_username} -p {db_schema} -e "select concat(column1, ',', column2, ',', column3) from test_table" > /Users/willem/Documents/exportfile.csv

Explanation - Importing files with "LOAD DATA"

The LOAD DATA syntax is:

LOAD DATA [LOW_PRIORITY | CONCURRENT] [LOCAL] INFILE '/path/and/file.name'
    [REPLACE | IGNORE]
    INTO TABLE table_name
    [CHARACTER SET charset_name]
    [FIELDS
        [TERMINATED BY 'string']
        [[OPTIONALLY] ENCLOSED BY 'string']
        [ESCAPED BY 'string']
    ]
    [LINES
        [STARTING BY 'string']
        [TERMINATED BY 'string']
    ]
    [IGNORE number LINES]
    [(col_name_or_user_var, ...)]
    [SET col_name = expr, ...]

The required and optional switches and clauses work as follows:

LOW_PRIORITY:

When this switch is specified, execution of the statement will be delayed until no clients are reading from the table (this works only with storage engines that use table-level locking, e.g. MyISAM, MEMORY, and MERGE).

CONCURRENT:

When this switch is specified and the table data is being imported to uses the MyISAM table and contains no free blocks in the middle, other threads can retrieve data from table while LOAD_DATA is executing.

LOCAL:

When this switch is specified the input file will be read from the host the MySQL client is running on instead of the MySQL server. When this switch is not specified, the input file will be read from the data directory on the MySQL host. To use LOAD DATA without the LOCAL switch, the connected user must have the "FILE" privilege.

REPLACE:

When this switch is specified lines read from the input file will replace existing rows in the target table (based on the Primary Key).

IGNORE:

When this switch is specified lines read from the input file will be ignored if rows with the same Primary Keys exist in the target table.

By default, when neither the IGNORE or LOCAL switches are used, duplicate rows will cause an error to occur and the rest of the input file to be ignored.

By default, when the LOCAL switch is used IGNORE is specified automatically since there's no way for the server to stop transmission of the input file once it has started.

To ignore Foreign Key constraints during a LOAD DATA operation, you need to issue the following statement before executing LOAD DATA:

SET foreign_key_check = 0

FIELDS:

Both the FIELDS and LINES clauses are optional, but if both are specified FIELDS must precede LINES.

All of the FIELDS subclauses are optional, but if you specified the LINES clause you must specify at least one subclause.

The TERMINATED BY subclause allows you to specify the field terminating string (e.g. a comma in the case of CSV files).

The ENCLOSED BY subclause allows you to specify the field enclosing string (e.g. a double-quote in the case of CSV files where fields contain commas). If the OPTIONAL switch is specified, lines without fields enclosed by the ENCLOSED BY string won't be ignored.

The ESCAPED BY subclause allows you to specify the string escape character (e.g. if you needed to use "%" instead of "\").

If you specify no FIELDS clause, the defaults are:

FIELDS TERMINATED BY '\t' ENCLOSED BY '' ESCAPED BY '\\'

LINES:

Both the FIELDS and LINES clauses are optional, but if both are specified FIELDS must precede LINES.

All of the LINES subclauses are optional, but if you specified the LINES clause you must specify at least one subclause.

The STARTING BY subclause allows you to specify the line starting string. This string (and anything preceding it) will be ignored during the LOAD DATA operation.

The TERMINATED BY subclause allows you to specify the line termination string. This string will be ignored during the LOAD DATA operation.

If you specify no LINES clause, the defaults are:

LINES TERMINATED BY '\n' STARTING BY ''

IGNORE {number} LINES:

When this clause is specified, the specified number of lines from the top of the input file will be ignored during the LOAD DATA operation (e.g. "IGNORE 1 LINES" will ignore a header line in a CSV file).

SET:

Assignments in the SET clause should only have column names on the left-hand side of the assignment operators, but column, variables, arithmetic, subqueries, etc. can be used on the right-hand side. (An exception to this is that you cannot use subqueries from the table that is being loaded).

Explanation - Exporting files with "SELECT INTO OUTFILE"

The SELECT INTO OUTFILE syntax is similar to a normal SELECT statement, with the exception that the INTO OUTFILE clause and its subclauses are added at the end:

SELECT
    [ALL | DISTINCT | DISTINCTROW ]
      [HIGH_PRIORITY]
      [STRAIGHT_JOIN]
      [SQL_SMALL_RESULT] [SQL_BIG_RESULT] [SQL_BUFFER_RESULT]
      [SQL_CACHE | SQL_NO_CACHE] [SQL_CALC_FOUND_ROWS]
    select_expr [, select_expr ...]
    [FROM table_references
    [WHERE where_condition]
    [GROUP BY {col_name | expr | position}
      [ASC | DESC], ... [WITH ROLLUP]]
    [HAVING where_condition]
    [ORDER BY {col_name | expr | position}
      [ASC | DESC], ...]
    [LIMIT {[offset,] row_count | row_count OFFSET offset}]
    [PROCEDURE procedure_name(argument_list)]
    [INTO OUTFILE 'file_name'
        [CHARACTER SET charset_name]
        export_options
      | INTO DUMPFILE 'file_name'
      | INTO var_name [, var_name]]
    [FOR UPDATE | LOCK IN SHARE MODE]]

With SELECT INTO OUTFILE the output file is always created on the MySQL server host, so the user executing the statement must have FILE privileges. Additionally, output files can never overwrite existing files.

The syntax for SELECT INTO OUTFILE's "export_options" consists of the same FIELDS and LINES clauses that are used with the LOAD DATA INFILE statement.

For more information, see the MySQL SELECT Documentation.

Do you like this? Share it:

Copyright © Geekology 2012. All Rights Reserved.