How to convert CSV to Excel in Python

Pandas can read, filter, and rearrange small and large datasets and output them in a range of formats, including Excel. In this article, we convert a .csv file into an Excel (.xlsx) file.
Pandas provides the ExcelWriter class for writing DataFrame objects to Excel sheets.
Syntax:

final = pd.ExcelWriter('GFG.xlsx')

Example:
Sample CSV File:

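import pandas as pd

# Read the CSV file into a DataFrame
df_new = pd.read_csv('Names.csv')

# The context manager saves and closes the workbook on exit;
# ExcelWriter.save() was removed in pandas 2.0
with pd.ExcelWriter('Names.xlsx') as GFG:
    df_new.to_excel(GFG, index=False)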

Output:

Method 2:

The read_* functions read data into pandas, and the to_* methods store data. The to_excel() method stores the data as an Excel file. In the example here, the sheet is named Testing instead of the default Sheet1. Setting index=False means the row index labels are not saved in the spreadsheet.


import pandas as pd

df = pd.read_csv("./weather_data.csv")

df.to_excel("weather.xlsx", sheet_name="Testing", index=False)


In this post there is a Python example to convert from csv to xls.

However, my file has more than 65536 rows, so xls does not work. Naming the file .xlsx doesn't make a difference. Is there a Python package to convert to xlsx?

asked Jul 15, 2013 at 10:21
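
For context, the 65,536-row ceiling comes from the legacy .xls format, while an .xlsx worksheet holds up to 1,048,576 rows; renaming the file does not change which format the writing library produces. A minimal sketch, using only the standard library, that counts CSV records and reports which format can hold them (the function names are illustrative, not from any library):

```python
import csv
import io

XLS_MAX_ROWS = 65_536        # row limit of the legacy .xls format
XLSX_MAX_ROWS = 1_048_576    # row limit of an .xlsx worksheet


def count_csv_rows(file_obj):
    """Count records (not physical lines) in an open CSV file object."""
    return sum(1 for _ in csv.reader(file_obj))


def smallest_format(row_count):
    """Return the oldest Excel format whose single sheet can hold the rows."""
    if row_count <= XLS_MAX_ROWS:
        return "xls"
    if row_count <= XLSX_MAX_ROWS:
        return "xlsx"
    return "split across sheets"


# Example with an in-memory CSV of 70,000 one-column records
sample = io.StringIO("\n".join(str(i) for i in range(70_000)))
rows = count_csv_rows(sample)
print(rows, smallest_format(rows))
```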

Here’s an example using xlsxwriter:

import os
import glob
import csv
from xlsxwriter.workbook import Workbook


for csvfile in glob.glob(os.path.join('.', '*.csv')):
    workbook = Workbook(csvfile[:-4] + '.xlsx')
    worksheet = workbook.add_worksheet()
    with open(csvfile, 'rt', encoding='utf8') as f:
        reader = csv.reader(f)
        for r, row in enumerate(reader):
            for c, col in enumerate(row):
                worksheet.write(r, c, col)
    workbook.close()

FYI, there is also a package called openpyxl, that can read/write Excel 2007 xlsx/xlsm files.

answered Jul 16, 2013 at 18:51

alecxe

With my library pyexcel,

 $ pip install pyexcel pyexcel-xlsx

you can do it with a single function call:

from pyexcel.cookbook import merge_all_to_a_book
# import pyexcel.ext.xlsx # no longer required if you use pyexcel >= 0.2.2 
import glob


merge_all_to_a_book(glob.glob("your_csv_directory/*.csv"), "output.xlsx")

Each CSV file gets its own sheet, named after the file.

answered Oct 19, 2014 at 23:42

chfw

Simple two line code solution using pandas

import pandas as pd

read_file = pd.read_csv('File name.csv')
read_file.to_excel('File name.xlsx', index=None, header=True)

answered Nov 16, 2019 at 23:11

Bhanu Sinha

First install openpyxl:

pip install openpyxl

Then:

from openpyxl import Workbook
import csv


wb = Workbook()
ws = wb.active
with open('test.csv', 'r') as f:
    for row in csv.reader(f):
        ws.append(row)
wb.save('name.xlsx')

answered Mar 9, 2017 at 19:07

zhuhuren

Adding an answer that exclusively uses the pandas library to read in a .csv file and save it as a .xlsx file. This example makes use of pandas.read_csv and pandas.DataFrame.to_excel.

The fully reproducible example uses numpy to generate random numbers only, and this can be removed if you would like to use your own .csv file.

import pandas as pd
import numpy as np

# Creating a dataframe and saving as test.csv in current directory
df = pd.DataFrame(np.random.randn(100000, 3), columns=list('ABC'))
df.to_csv('test.csv', index = False)

# Reading in test.csv and saving as test.xlsx

df_new = pd.read_csv('test.csv')
# Use a context manager: ExcelWriter.save() was removed in pandas 2.0
with pd.ExcelWriter('test.xlsx') as writer:
    df_new.to_excel(writer, index = False)

answered Dec 29, 2017 at 17:19

patrickjlong1

Simple 1-to-1 CSV to XLSX file conversion without enumerating/looping through the rows:

import pyexcel

sheet = pyexcel.get_sheet(file_name="myFile.csv", delimiter=",")
sheet.save_as("myFile.xlsx")

Notes:

I have found that if the file_name is really long (>30 characters excluding path), the resultant XLSX file will throw an error when Excel tries to load it. Excel will offer to fix the error, which it does, but it is frustrating.

There is a great answer previously provided that combines all of the CSV files in a directory into one XLSX workbook, which fits a different use case than a 1-to-1 CSV file to XLSX file conversion.
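
The error with long file names is probably caused by Excel's limit on worksheet names (31 characters, and none of the characters [ ] : * ? / \), since the sheet appears to be named after the file; that cause is an assumption, not something confirmed above. A minimal sketch of a helper (hypothetical name safe_sheet_name) that derives a valid sheet name from a file name:

```python
import os
import re

MAX_SHEET_NAME = 31  # Excel's limit on worksheet name length


def safe_sheet_name(file_name):
    """Derive an Excel-safe worksheet name from a CSV file name."""
    stem = os.path.splitext(os.path.basename(file_name))[0]
    # Replace the characters Excel forbids in sheet names: [ ] : * ? / \
    cleaned = re.sub(r"[\[\]:*?/\\]", "_", stem)
    # Truncate to the limit; fall back to a default name if nothing is left
    return cleaned[:MAX_SHEET_NAME] or "Sheet1"


print(safe_sheet_name("quarterly_sales_report_for_region_north_2020.csv"))
```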

answered Apr 8, 2020 at 20:16

Larry W

How I do it with openpyxl lib:

import csv
from openpyxl import Workbook

CSV_SEPARATOR = "#"  # delimiter used inside my_file.csv


def convert_csv_to_xlsx():
    wb = Workbook()
    sheet = wb.active

    with open("my_file.csv", newline="") as f:
        # Let csv.reader split on the custom separator directly
        reader = csv.reader(f, delimiter=CSV_SEPARATOR)
        for row in reader:
            # Each parsed row becomes one worksheet row
            sheet.append(row)

    wb.save("my_file.xlsx")

answered Aug 17, 2016 at 16:58

Rubycon

There is a simple way

import csv

from openpyxl import Workbook

if __name__ == '__main__':
    workbook = Workbook()
    worksheet = workbook.active
    # Python 3: pass the encoding to open() instead of calling
    # sys.setdefaultencoding(), which no longer exists
    with open('input.csv', 'r', encoding='utf8', newline='') as f:
        reader = csv.reader(f)
        for r, row in enumerate(reader):
            for c, col in enumerate(row):
                # openpyxl rows and columns are 1-based
                worksheet.cell(row=r+1, column=c+1, value=col)
    workbook.save('output.xlsx')

answered May 5, 2017 at 2:23

David Ding

There are many common file types that you will need to work with as a software developer. One such format is the CSV file. CSV stands for “Comma-Separated Values” and is a text file format that uses a comma as a delimiter to separate values from one another. Each row is its own record and each value is its own field. Most CSV files have records that are all the same length.

Microsoft Excel opens CSV files with no problem. You can open one yourself with Excel and then save it yourself in an Excel format. The purpose of this article is to teach you the following concepts:

Converting a CSV file to Excel
Converting an Excel spreadsheet to CSV

You will be using Python and OpenPyXL to do the conversion from one file type to the other.

Getting Started

You need to install OpenPyXL to be able to use the examples in this article. You can use pip to install OpenPyXL:

python3 -m pip install openpyxl

Now that you have OpenPyXL, you are ready to learn how to convert a CSV file to an Excel spreadsheet!

You will soon see that converting a CSV file to an Excel spreadsheet doesn’t take very much code. However, you do need to have a CSV file to get started. With that in mind, open up your favorite text editor (Notepad, SublimeText, or something else) and add the following:

book_title,author,publisher,pub_date,isbn
Python 101,Mike Driscoll, Mike Driscoll,2020,123456789
wxPython Recipes,Mike Driscoll,Apress,2018,978-1-4842-3237-8
Python Interviews,Mike Driscoll,Packt Publishing,2018,9781788399081

Save this file as books.csv. You can also download the CSV file from this book’s GitHub code repository.

Now that you have the CSV file, you need to create a new Python file too. Open up your Python IDE and create a new file named csv_to_excel.py. Then enter the following code:

# csv_to_excel.py

import csv
import openpyxl


def csv_to_excel(csv_file, excel_file):
    csv_data = []
    with open(csv_file) as file_obj:
        reader = csv.reader(file_obj)
        for row in reader:
            csv_data.append(row)

    workbook = openpyxl.Workbook()
    sheet = workbook.active
    for row in csv_data:
        sheet.append(row)
    workbook.save(excel_file)


if __name__ == "__main__":
    csv_to_excel("books.csv", "books.xlsx")

Your code uses Python’s csv module in addition to OpenPyXL. You create a function, csv_to_excel(), that accepts two arguments:

csv_file – The path to the input CSV file
excel_file – The path to the Excel file that you want to create

You want to extract each row of data from the CSV. To extract the data, you create a csv.reader() object and then iterate over it one row at a time. For each iteration, you append the row to csv_data. A row is a list of strings.

The next step of the process is to create the Excel spreadsheet. To add data to your Workbook, you iterate over each row in csv_data and append() them to your Worksheet. Finally, you save the Excel spreadsheet.
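
Because csv.reader yields every field as a string, the cells end up stored as text in Excel. If numeric columns should be real numbers, one option is to convert values before appending them. A minimal sketch, with a hypothetical helper named coerce; note that it would also turn all-digit identifiers or zero-padded codes into numbers, so in practice you may want to apply it only to selected columns:

```python
def coerce(value):
    """Convert a CSV string to int or float when possible, else keep it."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            continue
    return value


row = ["Python 101", "Mike Driscoll", "2020", "19.99", "978-1-4842-3237-8"]
converted = [coerce(field) for field in row]
print(converted)
```

In csv_to_excel(), the call sheet.append(row) could then become sheet.append([coerce(v) for v in row]).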

When you run this code, you will have an Excel spreadsheet that looks like this:

CSV to Excel Spreadsheet

You are now able to convert a CSV file to an Excel spreadsheet in less than twenty-five lines of code!

Now you are ready to learn how to convert an Excel spreadsheet to a CSV file!

Converting an Excel Spreadsheet to CSV

Converting an Excel spreadsheet to a CSV file can be useful if you need other processes to consume the data. Another potential need for a CSV file is when you need to share your Excel spreadsheet with someone who doesn’t have a spreadsheet program to open it. While rare, this may happen.

You can convert an Excel spreadsheet to a CSV file using Python. Create a new file named excel_to_csv.py and add the following code:

# excel_to_csv.py

import csv
import openpyxl

from openpyxl import load_workbook


def excel_to_csv(excel_file, csv_file):
    workbook = load_workbook(filename=excel_file)
    sheet = workbook.active
    csv_data = []
    
    # Read data from Excel
    for value in sheet.iter_rows(values_only=True):
        csv_data.append(list(value))

    # Write to CSV
    with open(csv_file, 'w', newline='') as csv_file_obj:
        writer = csv.writer(csv_file_obj, delimiter=',')
        for line in csv_data:
            writer.writerow(line)


if __name__ == "__main__":
    excel_to_csv("books.xlsx", "new_books.csv")

Once again you only need the csv and openpyxl modules to do the conversion. This time, you load the Excel spreadsheet first and iterate over the Worksheet using the iter_rows method. Because values_only=True is set, the value you receive in each iteration of iter_rows is a tuple of cell values rather than cell objects. You convert each tuple to a list and append it to csv_data.

The next step is to create a csv.writer(). Then you iterate over each list of strings in csv_data and call writerow() to add it to your CSV file.
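
One detail worth knowing: with values_only=True the cells come back with their stored types (numbers, dates, or None for empty cells), and csv.writer converts them to text, writing None as an empty field. A small standard-library sketch of that conversion, with a made-up row standing in for spreadsheet data:

```python
import csv
import io

# A row as it might come from a worksheet: text, an empty cell, a number
row = ("Python 101", None, 2020)

buffer = io.StringIO()
writer = csv.writer(buffer, delimiter=",")
writer.writerow(row)

csv_text = buffer.getvalue()
print(repr(csv_text))
```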

Once your code finishes, you will have a brand new CSV file!

Wrapping Up

Converting a CSV file to an Excel spreadsheet is easy to do with Python. It’s a useful tool that you can use to take in data from your clients or other data sources and transform it into something that you can present to your company.

You can apply cell styling to the data as you write it to your Worksheet too. By applying cell styling, you can make your data stand out with different fonts or background row colors.

Try this code out on your own Excel or CSV files and see what you can do.

Problem Formulation

💡 Challenge: Given a CSV file. How to convert it to an excel file in Python?

We create a folder with two files, the file csv_to_excel.py and my_file.csv. We want to convert the CSV file to an Excel file so that after running the script csv_to_excel.py, we obtain a third file, my_file.xlsx, in our folder.

All methods discussed in this tutorial show different code snippets to put into csv_to_excel.py so that it converts the CSV to XLSX in Python.

Method 1: 5 Easy Steps in Pandas

The most pythonic way to convert a .csv to an .xlsx (Excel) in Python is to use the Pandas library.

Install the pandas library with pip install pandas
Install the openpyxl library that is used internally by pandas with pip install openpyxl
Import the pandas library with import pandas as pd
Read the CSV file into a DataFrame df by using the expression df = pd.read_csv('my_file.csv')
Store the DataFrame in an Excel file by calling df.to_excel('my_file.xlsx', index=None, header=True)

import pandas as pd


df = pd.read_csv('my_file.csv')
df.to_excel('my_file.xlsx', index=None, header=True)

Note that there are many ways to customize the to_excel() function in case

you don’t need a header line,
you want to fix the first line in the Excel file,
you want to format the cells as numbers instead of strings, or
you have an index column in the original CSV and want to consider it in the Excel file too.

If you want to do any of those, feel free to read our full guide on the Finxter blog here:

🌍 Tutorial: Pandas DataFrame.to_excel() – An Unofficial Guide to Saving Data to Excel


Let’s have a look at an alternative to converting a CSV to an Excel file in Python:

Method 2: Modules csv and openpyxl

To convert a CSV to an Excel file, you can also use the following approach:

Import the csv module
Import the openpyxl module
Read the CSV file into a list of lists, one inner list per row, by using the csv.reader() function
Write the list of lists to the Excel file by using the workbook representation of the openpyxl library.
Get the active worksheet by calling workbook.active
Write to the worksheet by calling worksheet.append(row) and append one list of values, one value per cell.

The following function converts a given CSV to an Excel file:

import csv
import openpyxl


def csv_to_excel(csv_filename, excel_filename):

    # Read CSV file
    csv_data = []
    with open(csv_filename) as f:
        csv_data = [row for row in csv.reader(f)]
    
    # Write to Excel file
    workbook = openpyxl.workbook.Workbook()
    worksheet = workbook.active
    for row in csv_data:
        worksheet.append(row)
    workbook.save(excel_filename)


if __name__ == "__main__":
    csv_to_excel("my_file.csv", "my_file.xlsx")

This is a more fine-grained approach: it allows you to modify each row in the code or even write additional details into the Excel worksheet.

More Python CSV Conversions

🐍 Learn More: I have compiled an “ultimate guide” on the Finxter blog that shows you the best method, respectively, to convert a CSV file to JSON, Excel, dictionary, Parquet, list, list of lists, list of tuples, text file, DataFrame, XML, NumPy array, and list of dictionaries.


While working as a researcher in distributed systems, Dr. Christian Mayer found his love for teaching computer science students.

To help students reach higher levels of Python success, he founded the programming education website Finxter.com that has taught exponential skills to millions of coders worldwide. He’s the author of the best-selling programming books Python One-Liners (NoStarch 2020), The Art of Clean Code (NoStarch 2022), and The Book of Dash (NoStarch 2022). Chris also coauthored the Coffee Break Python series of self-published books. He’s a computer science enthusiast, freelancer, and owner of one of the top 10 largest Python blogs worldwide.

His passions are writing, reading, and coding. But his greatest passion is to serve aspiring coders through Finxter and help them to boost their skills. You can join his free email academy here.


I have this code:

def csv_to_xlsx(filename):
    import pandas as pd
    import os
    pd.read_csv('xlsx/' + filename + '.csv', sep=",", encoding="cp1251").to_excel('xlsx/' + filename + '.xlsx', index=None)
    os.remove('xlsx/' + filename + '.csv')

It raises this error:

Traceback (most recent call last):
  File "C:\Users\PycharmProjects\xl\drom\main.py", line 89, in <module>
    get_mail()
  File "C:\Users\PycharmProjects\xl\drom\main.py", line 46, in get_mail
    mails = get_file('febest', mails, part, ms[c])
  File "C:\Users\PycharmProjects\xl\drom\main.py", line 68, in get_file
    csv_to_xlsx(company + str(c))
  File "C:\Users\PycharmProjects\xl\drom\main.py", line 85, in csv_to_xlsx
    pd.read_csv('drom/xlsx/' + filename + '.csv', sep=",", encoding="cp1251").to_excel('drom/xlsx/' + filename + '.xlsx', index=None)
  File "C:\Users\PycharmProjects\xl\venv\lib\site-packages\pandas\util\_decorators.py", line 311, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\PycharmProjects\xl\venv\lib\site-packages\pandas\io\parsers\readers.py", line 680, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "C:\Users\PycharmProjects\xl\venv\lib\site-packages\pandas\io\parsers\readers.py", line 581, in _read
    return parser.read(nrows)
  File "C:\Users\PycharmProjects\xl\venv\lib\site-packages\pandas\io\parsers\readers.py", line 1255, in read
    index, columns, col_dict = self._engine.read(nrows)
  File "C:\Users\PycharmProjects\xl\venv\lib\site-packages\pandas\io\parsers\c_parser_wrapper.py", line 225, in read
    chunks = self._reader.read_low_memory(nrows)
  File "pandas\_libs\parsers.pyx", line 805, in pandas._libs.parsers.TextReader.read_low_memory
  File "pandas\_libs\parsers.pyx", line 861, in pandas._libs.parsers.TextReader._read_rows
  File "pandas\_libs\parsers.pyx", line 847, in pandas._libs.parsers.TextReader._tokenize_rows
  File "pandas\_libs\parsers.pyx", line 1960, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: Expected 8 fields in line 3, saw 10

How do I fix this?

Asked Jul 25, 2022

This code works for me, and the file is saved:

import pandas

filename=...
# Backslashes are doubled so they do not escape the closing quote
pandas.read_csv('c:\\work\\' + filename + '.csv', sep=";", encoding="utf8").to_excel('c:\\work\\' + filename + '.xlsx', index=None)
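
An error like "Expected 8 fields in line 3, saw 10" means some line splits into more fields than the first lines did, which often happens when the sep passed to read_csv is not the delimiter the file actually uses (note the sep=";" in the working code above). A minimal standard-library sketch for checking how many fields each line has under a candidate delimiter; the sample text and the function name field_counts are illustrative:

```python
import csv
import io


def field_counts(text, delimiter):
    """Return the number of fields on each line for the given delimiter."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return [len(row) for row in reader]


# Semicolon-separated data whose values contain decimal commas
sample = "a;b;c\n1,5;2;3\n4;5,5;6\n"

# With ',' the rows are ragged; with ';' every row has three fields
comma_counts = field_counts(sample, ",")
semicolon_counts = field_counts(sample, ";")
print(comma_counts, semicolon_counts)
```

If the counts are uniform for one delimiter and ragged for another, the uniform one is most likely the right value for sep.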


import os
import glob
import csv
from xlsxwriter.workbook import Workbook
for csvfile in glob.glob(os.path.join('.', '*.csv')):
    workbook = Workbook(csvfile[:-4] + '.xlsx')
    worksheet = workbook.add_worksheet()
    with open(csvfile, 'rt', encoding='utf8') as f:
        reader = csv.reader(f)
        for r, row in enumerate(reader):
            for c, col in enumerate(row):
                worksheet.write(r, c, col)
    workbook.close()


The Python programming language has many libraries that can read, write, and manipulate CSV files. Python’s built-in csv module is one such library. It can be used to read or write the contents of a CSV file or to parse it into individual strings, numbers, etc.

When it comes to converting CSV to an Excel file, we must use an external module that lets us work with Excel files (xlsx). There are a few such libraries to choose from.

For this article, we are going to use the xlsxwriter module.

Create and read CSV files

This example code creates a CSV file with a list of popular writers (3 male and 3 female writers).

import csv

with open('writers.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerow(["#", "Name", "Book", "Gender"])
    writer.writerow([1, "Agatha Christie", "Murder on the Orient Express", "Female"])
    writer.writerow([2, "J. K. Rowling", "Harry Potter", "Female"])
    writer.writerow([3, "J. R. R. Tolkien", "Lord of the Rings", "Male"])
    writer.writerow([4, "Stephen King", "The Shining", "Male"])
    writer.writerow([5, "Danielle Steel", "Invisible", "Female"])
    writer.writerow([6, "William Shakespeare", "Hamlet", "Male"])

The file is written at the default file location. If you open it with a notepad, it’s going to look like this:

Read CSV

This code reads the CSV file and prints the result on the console.

import csv

file = open("writers.csv")
csvreader = csv.reader(file)
for row in csvreader:
    print(row)
file.close()

Create Excel Sheet

Now, let’s create an Excel sheet.

import xlsxwriter

workbook = xlsxwriter.Workbook('writers.xlsx')
worksheet1 = workbook.add_worksheet('Male')
worksheet2 = workbook.add_worksheet('Female')
workbook.close()

This code creates an Excel file called writers.xlsx with two worksheets: Male and Female.

At the end of the code, there is the close function. Without it, the file won’t be created.

Convert a single CSV file to multiple sheets

In this part, we are going to read CSV and write everything into an Excel file. Let’s start from the header. There is only one CSV file, so we need to take the header and write it twice into both Excel worksheets.

# header is the first row of the CSV file, read with next(csvreader)
for index in range(len(header)):
    worksheet1.write(0, index, header[index])
    worksheet2.write(0, index, header[index])

Row and column indexes start from 0, so column 0 is column A and row 0 is row 1.
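
To make the zero-based indexing concrete, here is a minimal sketch that converts a zero-based (row, column) pair into the A1-style address shown in Excel. The function name to_a1 is made up for this illustration; XlsxWriter ships a similar helper, xl_rowcol_to_cell, in its xlsxwriter.utility module.

```python
def to_a1(row, col):
    """Convert zero-based (row, col) indexes to Excel A1 notation."""
    letters = ""
    n = col + 1  # switch to 1-based for the base-26 conversion
    while n > 0:
        n, remainder = divmod(n - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return f"{letters}{row + 1}"


print(to_a1(0, 0), to_a1(2, 3), to_a1(0, 26))
```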

The index starts from the first column and takes the first element from the list, then the second column, and the second element.

Now, we must do the same with the remaining CSV elements.

row_numer_male = 0
row_numer_female = 0

for row in csvreader:
    if row[3] == 'Male':
        row_numer_male += 1
        for index in range(len(header)):
            worksheet1.write(row_numer_male, index, row[index])
    elif row[3] == 'Female':
        row_numer_female += 1
        for index in range(len(header)):
            worksheet2.write(row_numer_female, index, row[index])

The code checks the fourth column of each CSV row: if it is Male, the row is written to the first worksheet; if it is Female, it is written to the second one.
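
The same routing can be written without one if/elif branch per worksheet by grouping the rows on the column value first. A standard-library sketch (the in-memory CSV is a shortened copy of the writers data); each key in groups could then become one worksheet:

```python
import csv
import io
from collections import defaultdict

csv_text = """#,Name,Book,Gender
1,Agatha Christie,Murder on the Orient Express,Female
3,J. R. R. Tolkien,Lord of the Rings,Male
4,Stephen King,The Shining,Male
"""

reader = csv.reader(io.StringIO(csv_text))
header = next(reader)
gender_col = header.index("Gender")

# Map each distinct value of the Gender column to its rows
groups = defaultdict(list)
for row in reader:
    groups[row[gender_col]].append(row)

for name, rows in groups.items():
    print(name, len(rows))
```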

The result for males:

And for females:

This is the full code:

import csv
import xlsxwriter

with open('writers.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerow(["#", "Name", "Book", "Gender"])
    writer.writerow([1, "Agatha Christie", "Murder on the Orient Express", "Female"])
    writer.writerow([2, "J. K. Rowling", "Harry Potter", "Female"])
    writer.writerow([3, "J. R. R. Tolkien", "Lord of the Rings", "Male"])
    writer.writerow([4, "Stephen King", "The Shining", "Male"])
    writer.writerow([5, "Danielle Steel", "Invisible", "Female"])
    writer.writerow([6, "William Shakespeare", "Hamlet", "Male"])

file = open("writers.csv")
csvreader = csv.reader(file)
header = next(csvreader)

workbook = xlsxwriter.Workbook('writers.xlsx')
worksheet1 = workbook.add_worksheet('Male')
worksheet2 = workbook.add_worksheet('Female')

for index in range(len(header)):
    worksheet1.write(0, index, header[index])
    worksheet2.write(0, index, header[index])

row_numer_male = 0
row_numer_female = 0

for row in csvreader:
    if row[3] == 'Male':
        row_numer_male += 1
        for index in range(len(header)):
            worksheet1.write(row_numer_male, index, row[index])
    elif row[3] == 'Female':
        row_numer_female += 1
        for index in range(len(header)):
            worksheet2.write(row_numer_female, index, row[index])

file.close()
workbook.close()

Convert multiple CSV files to Excel sheets

We can take a different approach. If we have multiple CSV files inside a directory, we can convert each of them into an Excel worksheet named after this file.

We can modify the previous code to create two CSV files, one for female and the other one for male writers:

with open('female_writers.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerow(["#", "Name", "Book", "Gender"])
    writer.writerow([1, "Agatha Christie", "Murder on the Orient Express", "Female"])
    writer.writerow([2, "J. K. Rowling", "Harry Potter", "Female"])
    writer.writerow([5, "Danielle Steel", "Invisible", "Female"])

with open('male_writers.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerow(["#", "Name", "Book", "Gender"])
    writer.writerow([3, "J. R. R. Tolkien", "Lord of the Rings", "Male"])
    writer.writerow([4, "Stephen King", "The Shining", "Male"])
    writer.writerow([6, "William Shakespeare", "Hamlet", "Male"])

Next, let’s read the CSV files.

There are a few ways we can use to get all the files with a certain extension; using the glob module is one of them.

import glob
import os

files = glob.glob(r'C:\path\*csv')

for file_path in files:
    # Print the loop variable (the original printed an undefined name)
    print(file_path)

The code above gets all the CSV files from the directory and prints them to the console.

What we need to do now, is to create an Excel file and use CSV files names as worksheet names. We also need to copy the contents of each CSV file into each sheet. The following code does just that.

import glob
import os
import csv
import xlsxwriter

files = glob.glob(r'C:\path\*csv')
workbook = xlsxwriter.Workbook('writers.xlsx')
row_numer = 0

for file_path in files:
    file = open(file_path)
    csvreader = csv.reader(file)
    file_name = os.path.basename(file_path)
    file_no_ext = os.path.splitext(file_name)[0]
    worksheet1 = workbook.add_worksheet(file_no_ext)
    row_numer = 0
    for row in csvreader:
        for index in range(len(row)):
            worksheet1.write(row_numer, index, row[index])
        row_numer += 1
    file.close()

workbook.close()

The os.path.basename function strips the directory part of the full file path and assigns only the file name to the file_name variable. Next, this name (with its extension) is split into the file name and the file extension, and the name part is assigned to file_no_ext.

Each worksheet is named using this variable.
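
As an alternative to os.path, the pathlib module from the standard library can produce the same name through the stem attribute. A short sketch (the Windows path is a made-up example):

```python
from pathlib import Path, PureWindowsPath

# Path.stem drops both the directory and the extension in one step
print(Path("male_writers.csv").stem)

# PureWindowsPath parses Windows-style paths on any operating system
sheet_name = PureWindowsPath(r"C:\path\female_writers.csv").stem
print(sheet_name)
```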


Pandas is a third-party Python module that can read and write data in many formats, such as CSV, JSON, Excel, the clipboard, and HTML. This example shows how to use pandas to read and write CSV files and how to save a pandas.DataFrame object to an Excel file.

1. How To Use Pandas In Python Application.

1.1 Install Python Pandas Module.

First, check whether the pandas module is installed by running the pip show pandas command in a terminal. If the output says the package cannot be found, run the pip install pandas command to install it.

$ pip show pandas
WARNING: Package(s) not found: pandas

$ pip install pandas
Collecting pandas
  Downloading pandas-1.2.3-cp37-cp37m-macosx_10_9_x86_64.whl (10.4 MB)
     |████████████████████████████████| 10.4 MB 135 kB/s 
Collecting pytz>=2017.3
  Downloading pytz-2021.1-py2.py3-none-any.whl (510 kB)
     |████████████████████████████████| 510 kB 295 kB/s 
Requirement already satisfied: numpy>=1.16.5 in /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages (from pandas) (1.20.1)
Requirement already satisfied: python-dateutil>=2.7.3 in /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages (from pandas) (2.8.1)
Requirement already satisfied: six>=1.5 in /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages (from python-dateutil>=2.7.3->pandas) (1.15.0)
Installing collected packages: pytz, pandas
Successfully installed pandas-1.2.3 pytz-2021.1

Because this example saves data to an Excel file through pandas, the XlsxWriter module must be installed as well. Run the command pip show XlsxWriter to check whether it is installed; if it is not, run pip install XlsxWriter to install it.

$ pip show XlsxWriter
WARNING: Package(s) not found: XlsxWriter

$ pip install XlsxWriter
Collecting XlsxWriter
  Downloading XlsxWriter-1.3.7-py2.py3-none-any.whl (144 kB)
     |████████████████████████████████| 144 kB 852 kB/s 
Installing collected packages: XlsxWriter
Successfully installed XlsxWriter-1.3.7

1.2 Import Python Pandas Module In Python Source File.

This is very simple: add the import pandas statement at the beginning of the Python source file, and you can then use its various methods.

2. Read CSV File Use Pandas.

Reading a CSV file with pandas is very easy: call the pandas module’s read_csv method with the CSV file path. The returned object is a pandas.DataFrame. It represents the whole content of the CSV file, and you can use its methods to sort and query the data or to change its index and columns.
```
data_frame = pandas.read_csv(csv_file)
```
You can pass an encoding parameter to the read_csv() method to specify the CSV file text character encoding.
```
data_frame = pandas.read_csv(csv_file, encoding='gbk')
```
Now you can call the returned DataFrame object’s head(n) method to get the first n rows of the text in the CSV file.
```
data_frame.head(n)
```

3. Pandas Write Data To CSV File.

After you edit the data in the pandas.DataFrame object, you can call its to_csv method to save the new data into a CSV file.
```
data_frame.to_csv(csv_file_path)
```

4. Pandas Write Data To Excel File.

Create a file writer with the pandas.ExcelWriter class.

excel_writer = pandas.ExcelWriter(excel_file_path, engine='xlsxwriter')

Call the DataFrame object’s to_excel method to write the DataFrame to a specific sheet of the Excel file.
```
data_frame.to_excel(excel_writer, sheet_name='Employee Info')
```
Call the writer’s close method to save the data to the Excel file (the older save method was removed in pandas 2.0).
```
excel_writer.close()
```

5. Python Pandas DataFrame Operation Methods.

5.1 Sort DataFrame Data By One Column.

Please note the data column name is case sensitive.

data_frame.sort_values(by=['Salary'], ascending=False)

5.2 Query DataFrame Data In A Range.

The below python code will query a range of data in the DataFrame object.

data_frame = data_frame.loc[(data_frame['Salary'] > 10000) & (data_frame['Salary'] < 20000)]

6. Python Pandas Read/Write CSV File And Save To Excel File Example.

Below is the content of the source CSV file used in this example; the file name is employee_info.csv.

Name,Hire Date,Salary
jerry,2010-01-01,16000
tom,2011-08-19,6000
kevin,2009-02-08,13000
richard,2012-03-19,5000
jackie,2015-06-08,28000
steven,2008-02-01,36000
jack,2006-09-19,8000
gary,2018-01-16,19000
john,2017-10-01,16600

The example Python file is named CSVExcelConvertionExample.py and contains the function below.

read_csv_file_by_pandas(csv_file).

import pandas
import os
import glob

# Read csv file use pandas module.
def read_csv_file_by_pandas(csv_file):
    data_frame = None
    if(os.path.exists(csv_file)):
        data_frame = pandas.read_csv(csv_file)
        
        print("------------------data frame all----------------------")
        print(data_frame)
        
        print("------------------data frame index----------------------")
        print(data_frame.index)
        
        data_frame = data_frame.set_index('Name')
        print("------------------set Name column as data frame index----------------------")
        print(data_frame.index)
        
        print("------------------data frame columns----------------------")
        print(data_frame.columns)
        
        print("------------------data frame values----------------------")
        print(data_frame.values)
        
        print("------------------data frame hire date series----------------------")
        print(data_frame['Hire Date'])
        
        print("------------------select multiple columns from data frame----------------------")
        print(data_frame[['Salary', 'Hire Date']])
    else:
        print(csv_file + " does not exist.")
    return data_frame


if __name__ == '__main__':    
    
    data_frame = read_csv_file_by_pandas("./employee_info.csv")

========================================================================================================
Execution output:

------------------data frame all----------------------
      Name   Hire Date  Salary
0    jerry  2010-01-01   16000
1      tom  2011-08-19    6000
2    kevin  2009-02-08   13000
3  richard  2012-03-19    5000
4   jackie  2015-06-08   28000
5   steven  2008-02-01   36000
6     jack  2006-09-19    8000
7     gary  2018-01-16   19000
8     john  2017-10-01   16600
------------------data frame index----------------------
RangeIndex(start=0, stop=9, step=1)
------------------set Name column as data frame index----------------------
Index(['jerry', 'tom', 'kevin', 'richard', 'jackie', 'steven', 'jack', 'gary',
       'john'],
      dtype='object', name='Name')
------------------data frame columns----------------------
Index(['Hire Date', 'Salary'], dtype='object')
------------------data frame values----------------------
[['2010-01-01' 16000]
 ['2011-08-19' 6000]
 ['2009-02-08' 13000]
 ['2012-03-19' 5000]
 ['2015-06-08' 28000]
 ['2008-02-01' 36000]
 ['2006-09-19' 8000]
 ['2018-01-16' 19000]
 ['2017-10-01' 16600]]
------------------data frame hire date series----------------------
Name
jerry      2010-01-01
tom        2011-08-19
kevin      2009-02-08
richard    2012-03-19
jackie     2015-06-08
steven     2008-02-01
jack       2006-09-19
gary       2018-01-16
john       2017-10-01
Name: Hire Date, dtype: object
------------------select multiple columns from data frame----------------------
         Salary   Hire Date
Name                       
jerry     16000  2010-01-01
tom        6000  2011-08-19
kevin     13000  2009-02-08
richard    5000  2012-03-19
jackie    28000  2015-06-08
steven    36000  2008-02-01
jack       8000  2006-09-19
gary      19000  2018-01-16
john      16600  2017-10-01

write_to_csv_file_by_pandas(csv_file_path, data_frame).

import pandas

import os

import glob

# read_csv_file_by_pandas(csv_file) is the same function as in the first listing above.

# Write pandas.DataFrame object to a csv file.
def write_to_csv_file_by_pandas(csv_file_path, data_frame):
    data_frame.to_csv(csv_file_path)
    print(csv_file_path + ' has been created.')


if __name__ == '__main__':
  
    data_frame = read_csv_file_by_pandas("./employee_info.csv")
    
    write_to_csv_file_by_pandas("./employee_info_new.csv", data_frame)

================================================================================================================
Execution output:

./employee_info_new.csv has been created.
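
Because Name was set as the index, to_csv writes it as the first column of the new file. The sketch below writes to an in-memory buffer instead of a file to show what the index parameter changes; the two inline rows and the buffer names are illustrative.

```python
import io

import pandas

data_frame = pandas.DataFrame(
    {"Name": ["jerry", "tom"], "Salary": [16000, 6000]}
).set_index("Name")

# By default the index is written as the first column.
buffer = io.StringIO()
data_frame.to_csv(buffer)
print(buffer.getvalue())

# index=False drops the index, so the Name values would be lost here.
buffer_no_index = io.StringIO()
data_frame.to_csv(buffer_no_index, index=False)
print(buffer_no_index.getvalue())
```

With a default RangeIndex, index=False is usually what you want; with a meaningful index such as Name, keep the default.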

write_to_excel_file_by_pandas(excel_file_path, data_frame).

import pandas

import os

import glob

# read_csv_file_by_pandas(csv_file) is the same function as in the first listing above.


# Write pandas.DataFrame object to an excel file.
def write_to_excel_file_by_pandas(excel_file_path, data_frame):
    # The with block saves and closes the file on exit.
    # ExcelWriter.save() was removed in pandas 2.0, so it is not called here.
    with pandas.ExcelWriter(excel_file_path, engine='xlsxwriter') as excel_writer:
        data_frame.to_excel(excel_writer, sheet_name='Employee Info')
    print(excel_file_path + ' has been created.')

if __name__ == '__main__':
    
    data_frame = read_csv_file_by_pandas("./employee_info.csv")
    
    write_to_excel_file_by_pandas("./employee_info_new.xlsx", data_frame)

==========================================================================================
Execution output:

./employee_info_new.xlsx has been created.

sort_data_frame_by_string_column(data_frame).

import pandas

import os

import glob

# read_csv_file_by_pandas(csv_file) is the same function as in the first listing above.


# Sort the DataFrame by Name (the index), whose values are strings.
def sort_data_frame_by_string_column(data_frame):
    data_frame = data_frame.sort_values(by=['Name'])
    print("--------------Sort data format by string column---------------")
    print(data_frame)


if __name__ == '__main__':

    data_frame = read_csv_file_by_pandas("./employee_info.csv")
    
    sort_data_frame_by_string_column(data_frame)

====================================================================================================
Execution output:

--------------Sort data format by string column---------------
          Hire Date  Salary
Name                       
gary     2018-01-16   19000
jack     2006-09-19    8000
jackie   2015-06-08   28000
jerry    2010-01-01   16000
john     2017-10-01   16600
kevin    2009-02-08   13000
richard  2012-03-19    5000
steven   2008-02-01   36000
tom      2011-08-19    6000

sort_data_frame_by_datetime_column(data_frame).

import pandas

import os

import glob

# read_csv_file_by_pandas(csv_file) is the same function as in the first listing above.


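# Sort the DataFrame by Hire Date. read_csv loads this column as strings (dtype object);
# the sort is still chronological because the dates use the YYYY-MM-DD format.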
def sort_data_frame_by_datetime_column(data_frame):
    data_frame = data_frame.sort_values(by=['Hire Date'])
    print("--------------Sort data format by date column---------------")
    print(data_frame)


if __name__ == '__main__':
    
    data_frame = read_csv_file_by_pandas("./employee_info.csv")
   
    sort_data_frame_by_datetime_column(data_frame)
===========================================================================================

Execution output:

--------------Sort data format by date column---------------
          Hire Date  Salary
Name                       
jack     2006-09-19    8000
steven   2008-02-01   36000
kevin    2009-02-08   13000
jerry    2010-01-01   16000
tom      2011-08-19    6000
richard  2012-03-19    5000
jackie   2015-06-08   28000
john     2017-10-01   16600
gary     2018-01-16   19000
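
In the first run above, the Hire Date series is reported with dtype object, so the values are strings rather than datetimes. Sorting still works because YYYY-MM-DD strings sort in chronological order. If a real datetime column is wanted, read_csv accepts a parse_dates argument. A minimal sketch, using three rows of the sample data as inline text instead of a file:

```python
import io

import pandas

csv_text = """Name,Hire Date,Salary
jerry,2010-01-01,16000
tom,2011-08-19,6000
jack,2006-09-19,8000
"""

# parse_dates converts the listed columns to a datetime dtype while reading.
data_frame = pandas.read_csv(io.StringIO(csv_text), parse_dates=["Hire Date"])
data_frame = data_frame.sort_values(by=["Hire Date"])

print(data_frame["Hire Date"].dtype)
print(data_frame["Name"].tolist())  # ['jack', 'jerry', 'tom']
```

A datetime column also sorts correctly when the source file uses a date format that does not sort as text.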

sort_data_frame_by_number_column(data_frame).

import pandas

import os

import glob

# read_csv_file_by_pandas(csv_file) is the same function as in the first listing above.

    
# Sort the DataFrame by the numeric Salary column in descending order.
def sort_data_frame_by_number_column(data_frame):
    data_frame = data_frame.sort_values(by=['Salary'], ascending=False)
    print("--------------Sort data format by number column desc---------------")
    print(data_frame) 


if __name__ == '__main__':
    data_frame = read_csv_file_by_pandas("./employee_info.csv")

    sort_data_frame_by_number_column(data_frame)

================================================================================

Execution output:

--------------Sort data format by number column desc---------------
          Hire Date  Salary
Name                       
steven   2008-02-01   36000
jackie   2015-06-08   28000
gary     2018-01-16   19000
john     2017-10-01   16600
jerry    2010-01-01   16000
kevin    2009-02-08   13000
jack     2006-09-19    8000
tom      2011-08-19    6000
richard  2012-03-19    5000

get_data_in_salary_range(data_frame).

import pandas

import os

import glob

# read_csv_file_by_pandas(csv_file) is the same function as in the first listing above.



# Get the rows whose Salary is greater than 10000 and less than 20000, sorted by Salary.
def get_data_in_salary_range(data_frame):
    data_frame = data_frame.loc[(data_frame['Salary'] > 10000) & (data_frame['Salary'] < 20000)]
    data_frame = data_frame.sort_values(by=['Salary'])
    
    print("-------------- Employee info whose salary between 10000 and 20000---------------")
    print(data_frame) 


if __name__ == '__main__':
    
    data_frame = read_csv_file_by_pandas("./employee_info.csv")
    
    get_data_in_salary_range(data_frame)

==============================================================================================================

Execution output:

-------------- Employee info whose salary between 10000 and 20000---------------
        Hire Date  Salary
Name                     
kevin  2009-02-08   13000
jerry  2010-01-01   16000
john   2017-10-01   16600
gary   2018-01-16   19000

get_data_in_hire_date_range(data_frame).

import pandas

import os

import glob

# read_csv_file_by_pandas(csv_file) is the same function as in the first listing above.

 
# Get the rows whose Hire Date lies strictly between two dates (compared as YYYY-MM-DD strings).
def get_data_in_hire_date_range(data_frame):
    min_hire_date = '2010-01-01'
    max_hire_date = '2017-01-01'
    data_frame = data_frame.loc[(data_frame['Hire Date'] > min_hire_date) & (data_frame['Hire Date'] < max_hire_date)]
    data_frame = data_frame.sort_values(by=['Hire Date'])
    print("-------------- Employee info whose Hire Date between 2010/01/01 and 2017/01/01---------------")
    print(data_frame)


if __name__ == '__main__':
    
    data_frame = read_csv_file_by_pandas("./employee_info.csv")
    
    get_data_in_hire_date_range(data_frame)

====================================================================================================

Execution output:

-------------- Employee info whose Hire Date between 2010/01/01 and 2017/01/01---------------
          Hire Date  Salary
Name                       
tom      2011-08-19    6000
richard  2012-03-19    5000
jackie   2015-06-08   28000

get_data_in_name_range(data_frame).

import pandas

import os

import glob

# read_csv_file_by_pandas(csv_file) is the same function as in the first listing above.

  
# Get the rows whose Name lies in a label range.
def get_data_in_name_range(data_frame):
    start_name = 'jerry'
    end_name = 'kevin'
    # First sort the data in the data_frame by the Name column.
    data_frame = data_frame.sort_values(by=['Name'])
    # Name is the index, so loc can slice by label directly; both end labels are included.
    data_frame = data_frame.loc[start_name:end_name]
    print("-------------- Employee info whose Name first character between jerry and kevin---------------")
    print(data_frame)


if __name__ == '__main__':
    
    data_frame = read_csv_file_by_pandas("./employee_info.csv")
    
    get_data_in_name_range(data_frame)

====================================================================================================

Execution output:

-------------- Employee info whose Name first character between jerry and kevin---------------
        Hire Date  Salary
Name                     
jerry  2010-01-01   16000
john   2017-10-01   16600
kevin  2009-02-08   13000

convert_csv_to_excel_in_folder(folder_path): see section 7 below for the Python source code of this function.

7. How To Convert Multiple CSV Files In A Folder To Excel File.

A reader (comment-95168) asked how to convert all the CSV files in a directory to Excel files automatically. The example code below implements this.

import pandas

import os

import glob

def convert_csv_to_excel_in_folder(folder_path):
    '''
    Convert all the CSV files in folder_path to Excel files.
    '''
    # Loop over all the CSV files under the path (glob returns only existing files).
    for csv_file in glob.glob(os.path.join(folder_path, '*.csv')):

        # Build the target Excel file path by swapping only the file extension.
        excel_file_path = os.path.splitext(csv_file)[0] + ".xlsx"

        # Read the CSV file with the pandas read_csv function.
        data_frame = pandas.read_csv(csv_file)

        # Write the data to a worksheet. The with block saves and closes the file;
        # ExcelWriter.save() was removed in pandas 2.0.
        with pandas.ExcelWriter(excel_file_path, engine='xlsxwriter') as excel_writer:
            data_frame.to_excel(excel_writer, sheet_name='sheet_1')

        print(excel_file_path + ' has been created.')

if __name__ == '__main__':
    
    convert_csv_to_excel_in_folder(".")

====================================================================================================

Execution output:

./employee_info_new.xlsx has been created.
./employee_info.xlsx has been created.
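
One detail worth noting when deriving the target file name: str.replace changes every occurrence of ".csv" in the path, including one inside a folder name, while os.path.splitext touches only the final extension. The helper name excel_path_for and the second path below are hypothetical and serve only to illustrate the difference.

```python
import os


def excel_path_for(csv_file):
    """Return csv_file with its final extension swapped for .xlsx."""
    base, _extension = os.path.splitext(csv_file)
    return base + ".xlsx"


print(excel_path_for("./employee_info.csv"))      # ./employee_info.xlsx
print(excel_path_for("./data.csv.d/report.csv"))  # ./data.csv.d/report.xlsx

# str.replace also rewrites the folder name, which points to a different directory.
print("./data.csv.d/report.csv".replace(".csv", ".xlsx"))  # ./data.xlsx.d/report.xlsx
```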
