How to convert CSV to Excel in Python


    Pandas can read, filter, and rearrange small and large datasets and write them out in a range of formats, including Excel. This article covers converting a .csv file into an Excel (.xlsx) file. 
    Pandas provides the ExcelWriter class for writing DataFrame objects to Excel sheets. 
    Syntax: 
     

    final = pd.ExcelWriter('GFG.xlsx')

    Example:

    Python3

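    import pandas as pd

    # Read the CSV file into a DataFrame
    df_new = pd.read_csv('Names.csv')

    # Create the Excel writer and write the DataFrame without the index column
    GFG = pd.ExcelWriter('Names.xlsx')
    df_new.to_excel(GFG, index=False)

    # close() saves the workbook; ExcelWriter.save() was removed in pandas 2.0
    GFG.close()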


     Method 2:

    The read_* functions read data into pandas, and the to_* methods store data. The to_excel() method stores the data as an Excel file. In the example here, the sheet is named Testing instead of the default Sheet1. Setting index=False keeps the row index labels out of the spreadsheet.

    Python3

    import pandas as pd

    df = pd.read_csv("./weather_data.csv")

    df.to_excel("weather.xlsx", sheet_name="Testing", index=False)


    In this post there is a Python example to convert from csv to xls.

    However, my file has more than 65536 rows, so xls does not work. If I name the file xlsx, it doesn't make a difference. Is there a Python package to convert to xlsx?

    asked Jul 15, 2013 at 10:21 by user670186
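The 65,536-row ceiling belongs to the legacy .xls format (which also stops at 256 columns); an .xlsx worksheet holds up to 1,048,576 rows and 16,384 columns. A writer that only produces .xls keeps that limit whatever the output file is called. A minimal sketch for checking which format a given CSV fits into; the function names and return labels are illustrative and not part of any library:

```python
import csv

# Worksheet size limits of the two Excel formats
XLS_MAX_ROWS, XLS_MAX_COLS = 65_536, 256
XLSX_MAX_ROWS, XLSX_MAX_COLS = 1_048_576, 16_384


def csv_shape(path, encoding="utf-8"):
    """Return (row_count, widest_row) without loading the file into memory."""
    rows = 0
    cols = 0
    with open(path, newline="", encoding=encoding) as f:
        for row in csv.reader(f):
            rows += 1
            cols = max(cols, len(row))
    return rows, cols


def smallest_format(rows, cols):
    """Pick the oldest Excel format whose single sheet can hold rows x cols."""
    if rows <= XLS_MAX_ROWS and cols <= XLS_MAX_COLS:
        return "xls"
    if rows <= XLSX_MAX_ROWS and cols <= XLSX_MAX_COLS:
        return "xlsx"
    return "too large for one sheet"
```

Calling smallest_format(*csv_shape("data.csv")) on a file with more than 65,536 rows would return "xlsx", which is the case described in the question.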

    Here’s an example using xlsxwriter:

    import os
    import glob
    import csv
    from xlsxwriter.workbook import Workbook
    
    
    for csvfile in glob.glob(os.path.join('.', '*.csv')):
        workbook = Workbook(csvfile[:-4] + '.xlsx')
        worksheet = workbook.add_worksheet()
        with open(csvfile, 'rt', encoding='utf8') as f:
            reader = csv.reader(f)
            for r, row in enumerate(reader):
                for c, col in enumerate(row):
                    worksheet.write(r, c, col)
        workbook.close()
    

    FYI, there is also a package called openpyxl, that can read/write Excel 2007 xlsx/xlsm files.

    answered Jul 16, 2013 at 18:51 by alecxe

    With my library pyexcel,

     $ pip install pyexcel pyexcel-xlsx
    

    you can do it in one command line:

    from pyexcel.cookbook import merge_all_to_a_book
    # import pyexcel.ext.xlsx # no longer required if you use pyexcel >= 0.2.2 
    import glob
    
    
    merge_all_to_a_book(glob.glob("your_csv_directory/*.csv"), "output.xlsx")
    

    Each CSV file gets its own sheet, named after the file.

    answered Oct 19, 2014 at 23:42 by chfw

    Simple two-line solution using pandas:

    import pandas as pd

    read_file = pd.read_csv('File name.csv')
    read_file.to_excel('File name.xlsx', index=None, header=True)
    

    answered Nov 16, 2019 at 23:11 by Bhanu Sinha

    First install openpyxl:

    pip install openpyxl
    

    Then:

    from openpyxl import Workbook
    import csv
    
    
    wb = Workbook()
    ws = wb.active
    with open('test.csv', 'r') as f:
        for row in csv.reader(f):
            ws.append(row)
    wb.save('name.xlsx')
    

    answered Mar 9, 2017 at 19:07 by zhuhuren

    Adding an answer that uses only the pandas library to read in a .csv file and save it as a .xlsx file. This example makes use of pandas.read_csv and pandas.DataFrame.to_excel.

    The fully reproducible example uses numpy to generate random numbers only, and this can be removed if you would like to use your own .csv file.

    import pandas as pd
    import numpy as np
    
    # Creating a dataframe and saving as test.csv in current directory
    df = pd.DataFrame(np.random.randn(100000, 3), columns=list('ABC'))
    df.to_csv('test.csv', index = False)
    
    # Reading in test.csv and saving as test.xlsx
    
    df_new = pd.read_csv('test.csv')
    # ExcelWriter.save() was removed in pandas 2.0; the context manager saves on exit
    with pd.ExcelWriter('test.xlsx') as writer:
        df_new.to_excel(writer, index=False)
    

    answered Dec 29, 2017 at 17:19 by patrickjlong1

    Simple 1-to-1 CSV to XLSX file conversion without enumerating/looping through the rows:

    import pyexcel
    
    sheet = pyexcel.get_sheet(file_name="myFile.csv", delimiter=",")
    sheet.save_as("myFile.xlsx")
    

    Notes:

    1. I have found that if the file_name is really long (>30 characters excluding path)
      then the resultant XLSX file will throw an error when Excel tries
      to load it. Excel will offer to fix the error which it does, but it
      is frustrating.
    2. There is a great answer previously provided that
      combines all of the CSV files in a directory into one XLSX workbook,
      which fits a different use case than just trying to do a 1-to-1 CSV file to
      XLSX file conversion.
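The error in note 1 is consistent with Excel's 31-character limit on worksheet names, assuming the sheet is named after the file (an assumption; the answer does not say how the sheet name is chosen). Excel also rejects the characters \ / ? * [ ] : in sheet names. A sketch of deriving a safe sheet name from a file path, with safe_sheet_name as an illustrative helper name:

```python
import os
import re

MAX_SHEET_NAME = 31  # Excel's limit on worksheet name length
FORBIDDEN = re.compile(r"[\\/?*\[\]:]")  # characters Excel rejects in sheet names


def safe_sheet_name(path):
    """Build a worksheet name from a file path that Excel will accept."""
    stem = os.path.splitext(os.path.basename(path))[0]
    cleaned = FORBIDDEN.sub("_", stem)
    # Fall back to a default name if nothing is left after cleaning
    return cleaned[:MAX_SHEET_NAME] or "Sheet1"
```

The result can then be passed wherever the chosen library accepts a sheet name, instead of relying on the file name.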

    answered Apr 8, 2020 at 20:16 by Larry W

    How I do it with openpyxl lib:

    import csv
    from openpyxl import Workbook

    CSV_SEPARATOR = "#"  # field delimiter used by the input file


    def convert_csv_to_xlsx():
        wb = Workbook()
        sheet = wb.active

        # Let csv.reader split on the custom separator instead of splitting by hand
        with open("my_file.csv", newline="") as f:
            reader = csv.reader(f, delimiter=CSV_SEPARATOR)
            for r, row in enumerate(reader):
                for c, val in enumerate(row):
                    # openpyxl rows and columns are 1-based
                    sheet.cell(row=r + 1, column=c + 1).value = val

        wb.save("my_file.xlsx")
    

    answered Aug 17, 2016 at 16:58 by Rubycon

    There is a simple way

    import csv

    from openpyxl import Workbook

    if __name__ == '__main__':
        workbook = Workbook()
        worksheet = workbook.active
        # Python 3: pass the encoding to open() instead of calling sys.setdefaultencoding()
        with open('input.csv', 'r', encoding='utf8', newline='') as f:
            reader = csv.reader(f)
            for r, row in enumerate(reader):
                for c, val in enumerate(row):
                    # csv.reader has already split the fields; openpyxl cells are 1-based
                    worksheet.cell(row=r + 1, column=c + 1).value = val
        workbook.save('output.xlsx')
    

    answered May 5, 2017 at 2:23 by David Ding

    There are many common file types that you will need to work with as a software developer. One such format is the CSV file. CSV stands for “Comma-Separated Values” and is a text file format that uses a comma as a delimiter to separate values from one another. Each row is its own record and each value is its own field. Most CSV files have records that are all the same length.

    Microsoft Excel opens CSV files with no problem. You can open one yourself with Excel and then save it yourself in an Excel format. The purpose of this article is to teach you the following concepts:

    • Converting a CSV file to Excel
    • Converting an Excel spreadsheet to CSV

    You will be using Python and OpenPyXL to do the conversion from one file type to the other.

    Getting Started

    You need to install OpenPyXL to be able to use the examples in this article. You can use pip to install OpenPyXL:

    python3 -m pip install openpyxl

    Now that you have OpenPyXL, you are ready to learn how to convert a CSV file to an Excel spreadsheet!

    You will soon see that converting a CSV file to an Excel spreadsheet doesn’t take very much code. However, you do need to have a CSV file to get started. With that in mind, open up your favorite text editor (Notepad, SublimeText, or something else) and add the following:

    book_title,author,publisher,pub_date,isbn
    Python 101,Mike Driscoll, Mike Driscoll,2020,123456789
    wxPython Recipes,Mike Driscoll,Apress,2018,978-1-4842-3237-8
    Python Interviews,Mike Driscoll,Packt Publishing,2018,9781788399081
    

    Save this file as books.csv. You can also download the CSV file from this book’s GitHub code repository.

    Now that you have the CSV file, you need to create a new Python file too. Open up your Python IDE and create a new file named csv_to_excel.py. Then enter the following code:

    # csv_to_excel.py
    
    import csv
    import openpyxl
    
    
    def csv_to_excel(csv_file, excel_file):
        csv_data = []
        with open(csv_file) as file_obj:
            reader = csv.reader(file_obj)
            for row in reader:
                csv_data.append(row)
    
        workbook = openpyxl.Workbook()
        sheet = workbook.active
        for row in csv_data:
            sheet.append(row)
        workbook.save(excel_file)
    
    
    if __name__ == "__main__":
        csv_to_excel("books.csv", "books.xlsx")
    

    Your code uses Python’s csv module in addition to OpenPyXL. You create a function, csv_to_excel(), that accepts two arguments:

    • csv_file – The path to the input CSV file
    • excel_file – The path to the Excel file that you want to create

    You want to extract each row of data from the CSV. To extract the data, you create a csv.reader() object and then iterate over it one row at a time. For each iteration, you append the row to csv_data. A row is a list of strings.

    The next step of the process is to create the Excel spreadsheet. To add data to your Workbook, you iterate over each row in csv_data and append() them to your Worksheet. Finally, you save the Excel spreadsheet.
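Because csv.reader() yields every field as a string, numeric columns such as pub_date arrive in the worksheet as text. If real numbers are wanted, each value can be converted before it is appended. A minimal sketch; coerce and coerce_row are illustrative helper names, not part of the csv module or OpenPyXL:

```python
def coerce(value):
    """Return an int or float when the text parses as one, else the original string."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            continue
    return value


def coerce_row(row):
    """Apply coerce() to every field of one CSV row."""
    return [coerce(value) for value in row]


print(coerce_row(["Python 101", "2020", "3.5", "978-1-4842-3237-8"]))
```

In csv_to_excel(), the append step would then read sheet.append(coerce_row(row)). Note the trade-off: identifiers with leading zeros, or a digits-only ISBN, would also become numbers, so columns like that are better left as strings.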

    When you run this code, you will have an Excel spreadsheet that looks like this:

    Figure: CSV to Excel Spreadsheet

    You are now able to convert a CSV file to an Excel spreadsheet in less than twenty-five lines of code!

    Now you are ready to learn how to convert an Excel spreadsheet to a CSV file!

    Converting an Excel Spreadsheet to CSV

    Converting an Excel spreadsheet to a CSV file can be useful if you need other processes to consume the data. Another potential need for a CSV file is when you need to share your Excel spreadsheet with someone who doesn’t have a spreadsheet program to open it. While rare, this may happen.

    You can convert an Excel spreadsheet to a CSV file using Python. Create a new file named excel_to_csv.py and add the following code:

    # excel_to_csv.py
    
    import csv
    import openpyxl
    
    from openpyxl import load_workbook
    
    
    def excel_to_csv(excel_file, csv_file):
        workbook = load_workbook(filename=excel_file)
        sheet = workbook.active
        csv_data = []
        
        # Read data from Excel
        for value in sheet.iter_rows(values_only=True):
            csv_data.append(list(value))
    
        # Write to CSV
        # newline='' stops the csv module from writing blank lines between rows on Windows
        with open(csv_file, 'w', newline='') as csv_file_obj:
            writer = csv.writer(csv_file_obj, delimiter=',')
            for line in csv_data:
                writer.writerow(line)
    
    
    if __name__ == "__main__":
        excel_to_csv("books.xlsx", "new_books.csv")
    

    Once again, you only need the csv and openpyxl modules to do the conversion. This time, you load the Excel spreadsheet first and iterate over the Worksheet using the iter_rows method. Because values_only=True is set, the value you receive in each iteration of iter_rows is a tuple of cell values. You convert each tuple to a list and append it to csv_data.

    The next step is to create a csv.writer(). Then you iterate over each list in csv_data and call writerow() to add it to your CSV file.

    Once your code finishes, you will have a brand new CSV file!

    Wrapping Up

    Converting a CSV file to an Excel spreadsheet is easy to do with Python. It’s a useful tool that you can use to take in data from your clients or other data sources and transform it into something that you can present to your company.

    You can apply cell styling to the data as you write it to your Worksheet too. By applying cell styling, you can make your data stand out with different fonts or background row colors.

    Try this code out on your own Excel or CSV files and see what you can do.


    In this quick guide, you’ll see the complete steps to convert a CSV file to an Excel file using Python.

    To start, here is a simple template that you can use to convert a CSV to Excel using Python:

    import pandas as pd
    
    read_file = pd.read_csv(r'Path where the CSV file is stored\File name.csv')
    read_file.to_excel(r'Path to store the Excel file\File name.xlsx', index=None, header=True)
    

    In the next section, you’ll see how to apply this template in practice.

    Step 1: Install the Pandas package

    If you haven’t already done so, install the Pandas package. You can use the following command to install the Pandas package (under Windows):

    pip install pandas
    

    Step 2: Capture the path where the CSV file is stored

    Next, capture the path where the CSV file is stored on your computer.

    Here is an example of a path where a CSV file is stored:

    C:\Users\Ron\Desktop\Test\Product_List.csv

    Where 'Product_List' is the current CSV file name, and 'csv' is the file extension.

    Step 3: Specify the path where the new Excel file will be stored

    Now, you’ll need to specify the path where the new Excel file will be stored. For example:

    C:\Users\Ron\Desktop\Test\New_Products.xlsx

    Where 'New_Products' is the new file name, and 'xlsx' is the Excel file extension.
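When the Excel file should sit in the same folder as the CSV file, the output path can be derived from the input path instead of being typed twice. A minimal sketch using the standard library's pathlib; PureWindowsPath is used so that the Windows-style example path behaves the same on any operating system, and excel_path_for is an illustrative name:

```python
from pathlib import PureWindowsPath


def excel_path_for(csv_path, new_name=None):
    """Return the .xlsx path in the same folder as csv_path.

    new_name optionally replaces the file name (given without extension).
    """
    path = PureWindowsPath(csv_path)
    if new_name is not None:
        path = path.with_name(new_name + path.suffix)
    return path.with_suffix(".xlsx")


source = r"C:\Users\Ron\Desktop\Test\Product_List.csv"
print(excel_path_for(source))                  # same name, .xlsx extension
print(excel_path_for(source, "New_Products"))  # renamed output file
```

On a real Windows machine, pathlib.Path would normally be used in place of PureWindowsPath, and the result can be passed straight to to_excel().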

    Step 4: Convert the CSV to Excel using Python

    For this final step, you’ll need to use the following template to perform the conversion:

    import pandas as pd
    
    read_file = pd.read_csv(r'Path where the CSV file is stored\File name.csv')
    read_file.to_excel(r'Path to store the Excel file\File name.xlsx', index=None, header=True)
    

    Here is the complete syntax for our example (note that you’ll need to modify the paths to reflect the location where the files will be stored on your computer):

    import pandas as pd
    
    read_file = pd.read_csv(r'C:\Users\Ron\Desktop\Test\Product_List.csv')
    read_file.to_excel(r'C:\Users\Ron\Desktop\Test\New_Products.xlsx', index=None, header=True)
    

    Run the code in Python and the new Excel file (i.e., New_Products) will be saved at your specified location.


    Problem Formulation

    💡 Challenge: Given a CSV file, how do you convert it to an Excel file in Python?

    We create a folder with two files, csv_to_excel.py and my_file.csv. We want to convert the CSV file to an Excel file so that, after running the script csv_to_excel.py, a third file, my_file.xlsx, appears in the folder.

    All methods discussed in this tutorial show different code snippets to put into csv_to_excel.py so that it converts the CSV to XLSX in Python.

    Method 1: 5 Easy Steps in Pandas

    The most pythonic way to convert a .csv to an .xlsx (Excel) in Python is to use the Pandas library.

    1. Install the pandas library with pip install pandas
    2. Install the openpyxl library that is used internally by pandas with pip install openpyxl
    3. Import the pandas library with import pandas as pd
    4. Read the CSV file into a DataFrame df by using the expression df = pd.read_csv('my_file.csv')
    5. Store the DataFrame in an Excel file by calling df.to_excel('my_file.xlsx', index=None, header=True)
    import pandas as pd
    
    
    df = pd.read_csv('my_file.csv')
    df.to_excel('my_file.xlsx', index=None, header=True)
    

    Note that there are many ways to customize the to_excel() function in case

    • you don’t need a header line,
    • you want to fix the first line in the Excel file,
    • you want to format the cells as numbers instead of strings, or
    • you have an index column in the original CSV and want to consider it in the Excel file too.

    If you want to do any of those, feel free to read our full guide on the Finxter blog here:

    🌍 Tutorial: Pandas DataFrame.to_excel() – An Unofficial Guide to Saving Data to Excel


    Let’s have a look at an alternative to converting a CSV to an Excel file in Python:

    Method 2: Modules csv and openpyxl

    To convert a CSV to an Excel file, you can also use the following approach:

    • Import the csv module
    • Import the openpyxl module
    • Read the CSV file into a list of lists, one inner list per row, by using the csv.reader() function
    • Write the list of lists to the Excel file by using the workbook representation of the openpyxl library.
    • Get the active worksheet by calling workbook.active
    • Write to the worksheet by calling worksheet.append(row) and append one list of values, one value per cell.

    The following function converts a given CSV to an Excel file:

    import csv
    import openpyxl
    
    
    def csv_to_excel(csv_filename, excel_filename):
    
        # Read CSV file
        csv_data = []
        with open(csv_filename) as f:
            csv_data = [row for row in csv.reader(f)]
        
        # Write to Excel file
        workbook = openpyxl.workbook.Workbook()
        worksheet = workbook.active
        for row in csv_data:
            worksheet.append(row)
        workbook.save(excel_filename)
    
    
    if __name__ == "__main__":
        csv_to_excel("my_file.csv", "my_file.xlsx")

    This approach is more fine-grained: it lets you modify each row in code or write additional details into the Excel worksheet.

    More Python CSV Conversions

    🐍 Learn More: I have compiled an “ultimate guide” on the Finxter blog that shows you the best method, respectively, to convert a CSV file to JSON, Excel, dictionary, Parquet, list, list of lists, list of tuples, text file, DataFrame, XML, NumPy array, and list of dictionaries.


    While working as a researcher in distributed systems, Dr. Christian Mayer found his love for teaching computer science students.

    To help students reach higher levels of Python success, he founded the programming education website Finxter.com that has taught exponential skills to millions of coders worldwide. He’s the author of the best-selling programming books Python One-Liners (NoStarch 2020), The Art of Clean Code (NoStarch 2022), and The Book of Dash (NoStarch 2022). Chris also coauthored the Coffee Break Python series of self-published books. He’s a computer science enthusiast, freelancer, and owner of one of the top 10 largest Python blogs worldwide.

    His passions are writing, reading, and coding. But his greatest passion is to serve aspiring coders through Finxter and help them to boost their skills. You can join his free email academy here.

    Here is the code:

    def csv_to_xlsx(filename):
        import pandas as pd
        import os
        pd.read_csv('xlsx/' + filename + '.csv', sep=",", encoding="cp1251").to_excel('xlsx/' + filename + '.xlsx', index=None)
        os.remove('xlsx/' + filename + '.csv')

    It raises this error:

    Traceback (most recent call last):
      File "C:\Users\PycharmProjects\xl\drom\main.py", line 89, in <module>
        get_mail()
      File "C:\Users\PycharmProjects\xl\drom\main.py", line 46, in get_mail
        mails = get_file('febest', mails, part, ms[c])
      File "C:\Users\PycharmProjects\xl\drom\main.py", line 68, in get_file
        csv_to_xlsx(company + str(c))
      File "C:\Users\PycharmProjects\xl\drom\main.py", line 85, in csv_to_xlsx
        pd.read_csv('drom/xlsx/' + filename + '.csv', sep=",", encoding="cp1251").to_excel('drom/xlsx/' + filename + '.xlsx', index=None)
      File "C:\Users\PycharmProjects\xl\venv\lib\site-packages\pandas\util\_decorators.py", line 311, in wrapper
        return func(*args, **kwargs)
      File "C:\Users\PycharmProjects\xl\venv\lib\site-packages\pandas\io\parsers\readers.py", line 680, in read_csv
        return _read(filepath_or_buffer, kwds)
      File "C:\Users\PycharmProjects\xl\venv\lib\site-packages\pandas\io\parsers\readers.py", line 581, in _read
        return parser.read(nrows)
      File "C:\Users\PycharmProjects\xl\venv\lib\site-packages\pandas\io\parsers\readers.py", line 1255, in read
        index, columns, col_dict = self._engine.read(nrows)
      File "C:\Users\PycharmProjects\xl\venv\lib\site-packages\pandas\io\parsers\c_parser_wrapper.py", line 225, in read
        chunks = self._reader.read_low_memory(nrows)
      File "pandas\_libs\parsers.pyx", line 805, in pandas._libs.parsers.TextReader.read_low_memory
      File "pandas\_libs\parsers.pyx", line 861, in pandas._libs.parsers.TextReader._read_rows
      File "pandas\_libs\parsers.pyx", line 847, in pandas._libs.parsers.TextReader._tokenize_rows
      File "pandas\_libs\parsers.pyx", line 1960, in pandas._libs.parsers.raise_parser_error
    pandas.errors.ParserError: Error tokenizing data. C error: Expected 8 fields in line 3, saw 10

    How do I fix this?

    Asked Jul 25, 2022

    This code works for me; the file is saved:

    import pandas
    filename=...
    pandas.read_csv('c:\\work\\' + filename + '.csv', sep=";", encoding="utf8").to_excel('c:\\work\\' + filename + '.xlsx', index=None)
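The ParserError means that line 3 splits into more fields than the header does under the separator that was passed. One possible cause is that the file actually uses a different delimiter, such as the ";" in the snippet above; another is unquoted commas inside values. A sketch that inspects the file with the standard library before handing it to pandas; the function names are illustrative, and the cp1251 default only mirrors the question:

```python
import csv
from collections import Counter


def guess_delimiter(path, encoding="cp1251", candidates=",;\t|"):
    """Guess the delimiter from the first few KB of the file."""
    with open(path, newline="", encoding=encoding) as f:
        sample = f.read(4096)
    # csv.Sniffer raises csv.Error when it cannot decide
    return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter


def field_counts(path, delimiter, encoding="cp1251"):
    """Map 'number of fields' to 'how many lines have that many fields'."""
    with open(path, newline="", encoding=encoding) as f:
        return Counter(len(row) for row in csv.reader(f, delimiter=delimiter))
```

If field_counts() returns a single key for one delimiter and several keys for another, the first delimiter is the likelier choice for the sep argument of read_csv. The sniffer is a heuristic, so its guess is worth confirming against the counts.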


    import os
    import glob
    import csv
    from xlsxwriter.workbook import Workbook
    for csvfile in glob.glob(os.path.join('.', '*.csv')):
        workbook = Workbook(csvfile[:-4] + '.xlsx')
        worksheet = workbook.add_worksheet()
        with open(csvfile, 'rt', encoding='utf8') as f:
            reader = csv.reader(f)
            for r, row in enumerate(reader):
                for c, col in enumerate(row):
                    worksheet.write(r, c, col)
        workbook.close()



    The Python programming language has many libraries that can read, write, and manipulate CSV files. Python’s built-in csv module is one such library. It can be used to read or write the contents of a CSV file or to parse it into individual strings, numbers, etc.

    When it comes to converting CSV to an Excel file, we must use an external module that lets us work with Excel files (xlsx). There are a few such libraries to choose from.

    For this article, we are going to use the xlsxwriter module.

    Create and read CSV files

    This example code creates a CSV file with a list of popular writers (3 male and 3 female writers).

    import csv

    with open('writers.csv', 'w', newline='') as file:
        writer = csv.writer(file)
        writer.writerow(["#", "Name", "Book", "Gender"])
        writer.writerow([1, "Agatha Christie", "Murder on the Orient Express", "Female"])
        writer.writerow([2, "J. K. Rowling", "Harry Potter", "Female"])
        writer.writerow([3, "J. R. R. Tolkien", "Lord of the Rings", "Male"])
        writer.writerow([4, "Stephen King", "The Shining", "Male"])
        writer.writerow([5, "Danielle Steel", "Invisible", "Female"])
        writer.writerow([6, "William Shakespeare", "Hamlet", "Male"])

    The file is written at the default file location. If you open it with a notepad, it’s going to look like this:

    Read CSV

    This code reads the CSV file and prints the result on the console.

    import csv

    file = open("writers.csv")
    csvreader = csv.reader(file)
    for row in csvreader:
        print(row)
    file.close()
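The reading step above closes the file by hand. A minimal sketch of the same idea with a with block and csv.DictReader is shown here; the in-memory sample (a shortened stand-in for writers.csv) and the variable names are assumptions made so the snippet runs on its own.

```python
import csv
import io

# Stand-in for writers.csv so the sketch runs without touching the disk.
sample = io.StringIO(
    "#,Name,Book,Gender\n"
    "1,Agatha Christie,Murder on the Orient Express,Female\n"
    "3,J. R. R. Tolkien,Lord of the Rings,Male\n"
)

# The with block closes the stream even if an exception is raised.
with sample as file:
    reader = csv.DictReader(file)  # uses the first row as field names
    rows = list(reader)

for row in rows:
    print(row["Name"], "-", row["Book"])
```

With a real file, replacing the StringIO object with open("writers.csv", newline="") should behave the same way.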

    Create Excel Sheet

    Now, let’s create an Excel sheet.

    import xlsxwriter

    workbook = xlsxwriter.Workbook('writers.xlsx')
    worksheet1 = workbook.add_worksheet('Male')
    worksheet2 = workbook.add_worksheet('Female')
    workbook.close()

    This code creates an Excel file called writers.xlsx with two worksheets: Male and Female.

    At the end of the code, the workbook is closed with the close function. Without this call, the file won’t be created.

    Convert a single CSV file to multiple sheets

    In this part, we are going to read the CSV file and write everything into an Excel file. Let’s start with the header, which is read from the first CSV row with header = next(csvreader). There is only one CSV file, so the header has to be written twice, once into each Excel worksheet.

    for index in range(len(header)):
        worksheet1.write(0, index, header[index])
        worksheet2.write(0, index, header[index])

    Row and column indexes start at 0, so row 0 is row 1 in Excel and column 0 is column A.

    The loop walks through the header list: the first element goes into the first column, the second element into the second column, and so on.
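To make the zero-based indexing concrete, here is a small sketch that converts a (row, column) pair into the A1-style reference Excel displays. The helper name rowcol_to_a1 is hypothetical and written only for illustration; XlsxWriter ships its own utility for this (xl_rowcol_to_cell in xlsxwriter.utility), which is likely the better choice in real code.

```python
def rowcol_to_a1(row: int, col: int) -> str:
    """Convert zero-based (row, col) indexes to an Excel A1-style reference."""
    if row < 0 or col < 0:
        raise ValueError("row and col must be non-negative")
    letters = ""
    n = col + 1  # switch to 1-based so that 1 -> A, 26 -> Z, 27 -> AA
    while n > 0:
        n, remainder = divmod(n - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return f"{letters}{row + 1}"


print(rowcol_to_a1(0, 0))   # first header cell
print(rowcol_to_a1(0, 3))   # fourth header cell (Gender)
print(rowcol_to_a1(4, 26))  # a column past Z
```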

    Now, we must do the same with the remaining CSV elements.

    row_numer_male = 0
    row_numer_female = 0

    for row in csvreader:
        if row[3] == 'Male':
            row_numer_male += 1
            for index in range(len(header)):
                worksheet1.write(row_numer_male, index, row[index])
        elif row[3] == 'Female':
            row_numer_female += 1
            for index in range(len(header)):
                worksheet2.write(row_numer_female, index, row[index])

    The code checks the fourth column of each CSV row. If the value is Male, the row is written to the first worksheet; if it is Female, the row goes to the second one.
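The routing logic can also be expressed without one counter per worksheet. The sketch below groups rows by the Gender column into a dictionary; it uses only the standard library, and the in-memory sample and the names groups and gender_col are assumptions for illustration. Each group could then be written to its own worksheet.

```python
import csv
import io
from collections import defaultdict

# Shortened stand-in for writers.csv.
sample = io.StringIO(
    "#,Name,Book,Gender\n"
    "1,Agatha Christie,Murder on the Orient Express,Female\n"
    "3,J. R. R. Tolkien,Lord of the Rings,Male\n"
    "4,Stephen King,The Shining,Male\n"
)

reader = csv.reader(sample)
header = next(reader)
gender_col = header.index("Gender")  # look the column up by name, not by position

# Map each distinct value of the column to the list of rows that carry it.
groups = defaultdict(list)
for row in reader:
    groups[row[gender_col]].append(row)

for gender, rows in groups.items():
    print(gender, [r[1] for r in rows])
```

This form keeps working if a third category appears in the column, at the cost of holding all rows in memory.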

    The result for males:

    And for females:

    This is the full code:


    import csv
    import xlsxwriter

    with open('writers.csv', 'w', newline='') as file:
        writer = csv.writer(file)
        writer.writerow(["#", "Name", "Book", "Gender"])
        writer.writerow([1, "Agatha Christie", "Murder on the Orient Express", "Female"])
        writer.writerow([2, "J. K. Rowling", "Harry Potter", "Female"])
        writer.writerow([3, "J. R. R. Tolkien", "Lord of the Rings", "Male"])
        writer.writerow([4, "Stephen King", "The Shining", "Male"])
        writer.writerow([5, "Danielle Steel", "Invisible", "Female"])
        writer.writerow([6, "William Shakespeare", "Hamlet", "Male"])

    file = open("writers.csv")
    csvreader = csv.reader(file)
    header = next(csvreader)

    workbook = xlsxwriter.Workbook('writers.xlsx')
    worksheet1 = workbook.add_worksheet('Male')
    worksheet2 = workbook.add_worksheet('Female')

    for index in range(len(header)):
        worksheet1.write(0, index, header[index])
        worksheet2.write(0, index, header[index])

    row_numer_male = 0
    row_numer_female = 0

    for row in csvreader:
        if row[3] == 'Male':
            row_numer_male += 1
            for index in range(len(header)):
                worksheet1.write(row_numer_male, index, row[index])
        elif row[3] == 'Female':
            row_numer_female += 1
            for index in range(len(header)):
                worksheet2.write(row_numer_female, index, row[index])

    file.close()
    workbook.close()

    Convert multiple CSV files to Excel sheets

    We can take a different approach. If we have multiple CSV files inside a directory, we can convert each of them into an Excel worksheet named after this file.

    We can modify the previous code to create two CSV files, one for female and the other one for male writers:

    with open('female_writers.csv', 'w', newline='') as file:
        writer = csv.writer(file)
        writer.writerow(["#", "Name", "Book", "Gender"])
        writer.writerow([1, "Agatha Christie", "Murder on the Orient Express", "Female"])
        writer.writerow([2, "J. K. Rowling", "Harry Potter", "Female"])
        writer.writerow([5, "Danielle Steel", "Invisible", "Female"])

    with open('male_writers.csv', 'w', newline='') as file:
        writer = csv.writer(file)
        writer.writerow(["#", "Name", "Book", "Gender"])
        writer.writerow([3, "J. R. R. Tolkien", "Lord of the Rings", "Male"])
        writer.writerow([4, "Stephen King", "The Shining", "Male"])
        writer.writerow([6, "William Shakespeare", "Hamlet", "Male"])

    Next, let’s read the CSV files.

    There are a few ways we can use to get all the files with a certain extension; using the glob module is one of them.

    import glob
    import os

    # Placeholder directory; replace it with the folder that holds the CSV files.
    files = glob.glob(r'C:\path\*.csv')

    for file_path in files:
        print(file_path)

    The code above gets all the CSV files from the directory and prints them to the console.
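The order in which glob returns files depends on the operating system, so the worksheets may come out in a different order on different machines. A minimal sketch of one way to make the order predictable is shown here; the temporary folder and the file names are assumptions used only so the example is self-contained.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # Create two CSV files and one unrelated file in a scratch folder.
    for name in ("male_writers.csv", "female_writers.csv", "notes.txt"):
        with open(os.path.join(folder, name), "w", newline="") as f:
            f.write("#,Name\n")

    # os.path.join builds the pattern portably; sorted() fixes the order.
    pattern = os.path.join(folder, "*.csv")
    csv_files = sorted(glob.glob(pattern))
    csv_names = [os.path.basename(p) for p in csv_files]

print(csv_names)
```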

    What we need to do now is create an Excel file and use the CSV file names as worksheet names. We also need to copy the contents of each CSV file into its sheet. The following code does just that.


    import glob
    import os
    import csv
    import xlsxwriter

    # Placeholder directory; replace it with the folder that holds the CSV files.
    files = glob.glob(r'C:\path\*.csv')

    workbook = xlsxwriter.Workbook('writers.xlsx')

    for file_path in files:
        file = open(file_path)
        csvreader = csv.reader(file)
        file_name = os.path.basename(file_path)
        file_no_ext = os.path.splitext(file_name)[0]
        worksheet1 = workbook.add_worksheet(file_no_ext)
        row_numer = 0  # restart at the top row of each new worksheet
        for row in csvreader:
            for index in range(len(row)):
                worksheet1.write(row_numer, index, row[index])
            row_numer += 1
        file.close()

    workbook.close()

    The os.path.basename function strips the directory part of the path and assigns only the file name to the file_name variable. Next, os.path.splitext splits this name (with its extension) into the name and the extension, and the name part is assigned to file_no_ext.

    Each worksheet is named using this variable.
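Excel limits worksheet names to 31 characters and does not allow the characters [ ] : * ? / \ in them, so a file name is not always usable as a sheet name as-is. The sketch below shows one possible clean-up step; the helper safe_sheet_name and the fallback name "Sheet" are hypothetical, and duplicate names would still need separate handling.

```python
import os
import re

INVALID_SHEET_CHARS = re.compile(r"[\[\]:*?/\\]")
MAX_SHEET_NAME_LENGTH = 31


def safe_sheet_name(file_path: str) -> str:
    """Derive a worksheet name from a file path that Excel should accept."""
    name = os.path.splitext(os.path.basename(file_path))[0]
    # Replace forbidden characters and drop leading/trailing apostrophes.
    name = INVALID_SHEET_CHARS.sub("_", name).strip("'")
    return name[:MAX_SHEET_NAME_LENGTH] or "Sheet"


print(safe_sheet_name("data/male_writers.csv"))
print(safe_sheet_name("data/sales[2023]:q1?.csv"))
print(len(safe_sheet_name("data/" + "x" * 40 + ".csv")))
```

The result could be passed to workbook.add_worksheet in place of file_no_ext.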


    Pandas is a third-party Python module that can read and write data in many formats, such as CSV, JSON, Excel, the clipboard, and HTML. This example shows how to use pandas to read and write CSV files and how to save a pandas.DataFrame object to an Excel file.

    1. How To Use Pandas In Python Application.

    1.1 Install Python Pandas Module.

    1. First, make sure the pandas module is installed by running the pip show pandas command in a terminal. If the command reports that the package cannot be found, run the pip install pandas command to install it.
      $ pip show pandas
      WARNING: Package(s) not found: pandas
      
      $ pip install pandas
      Collecting pandas
        Downloading pandas-1.2.3-cp37-cp37m-macosx_10_9_x86_64.whl (10.4 MB)
           |████████████████████████████████| 10.4 MB 135 kB/s 
      Collecting pytz>=2017.3
        Downloading pytz-2021.1-py2.py3-none-any.whl (510 kB)
           |████████████████████████████████| 510 kB 295 kB/s 
      Requirement already satisfied: numpy>=1.16.5 in /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages (from pandas) (1.20.1)
      Requirement already satisfied: python-dateutil>=2.7.3 in /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages (from pandas) (2.8.1)
      Requirement already satisfied: six>=1.5 in /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages (from python-dateutil>=2.7.3->pandas) (1.15.0)
      Installing collected packages: pytz, pandas
      Successfully installed pandas-1.2.3 pytz-2021.1
    2. This example saves data to an Excel file through pandas, so the XlsxWriter module must be installed as well. Run the command pip show XlsxWriter to check whether it is installed; if it is not, run pip install XlsxWriter to install it.
      $ pip show XlsxWriter
      WARNING: Package(s) not found: XlsxWriter
      
      $ pip install XlsxWriter
      Collecting XlsxWriter
        Downloading XlsxWriter-1.3.7-py2.py3-none-any.whl (144 kB)
           |████████████████████████████████| 144 kB 852 kB/s 
      Installing collected packages: XlsxWriter
      Successfully installed XlsxWriter-1.3.7

    1.2 Import Python Pandas Module In Python Source File.

    1. Add the import pandas statement at the beginning of the Python source file; after that you can use its functions and classes.

    2. Read CSV File Use Pandas.

    1. To read a CSV file with pandas, call the module’s read_csv function with the CSV file path. It returns a pandas.DataFrame object that holds all the data of the CSV file; you can use its methods to sort and query the data, change the index or columns, and so on.
      data_frame = pandas.read_csv(csv_file)
    2. You can pass an encoding parameter to the read_csv() method to specify the CSV file text character encoding.
      data_frame = pandas.read_csv(csv_file, encoding='gbk')
    3. Now you can call the returned DataFrame object’s head(n) method to get the first n rows of the data.
      data_frame.head(n)
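The three calls above can be tried without a file on disk. This is a minimal sketch that feeds read_csv an in-memory string (an assumption made for self-containment, using a few rows in the shape of employee_info.csv) and also passes parse_dates, an optional read_csv parameter that turns the Hire Date column into real datetime values.

```python
import io

import pandas

csv_text = (
    "Name,Hire Date,Salary\n"
    "jerry,2010-01-01,16000\n"
    "tom,2011-08-19,6000\n"
    "kevin,2009-02-08,13000\n"
)

# read_csv accepts a path or any file-like object.
data_frame = pandas.read_csv(io.StringIO(csv_text), parse_dates=["Hire Date"])

first_two = data_frame.head(2)  # first two data rows
print(first_two)
print(data_frame.dtypes)
```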

    3. Pandas Write Data To CSV File.

    1. After you edit the data in the pandas.DataFrame object, you can call its to_csv method to save the new data into a CSV file.
      data_frame.to_csv(csv_file_path)
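By default to_csv also writes the row index as the first column. A short sketch of the difference, assuming a small hand-made DataFrame with the default numeric index, is shown below; calling to_csv without a path returns the CSV text instead of writing a file.

```python
import pandas

data_frame = pandas.DataFrame(
    {"Name": ["jerry", "tom"], "Salary": [16000, 6000]}
)

with_index = data_frame.to_csv()                # index becomes an unnamed first column
without_index = data_frame.to_csv(index=False)  # only the data columns are written

print(with_index)
print(without_index)
```

When the index carries real data, as in the example later in this article where Name is set as the index, keeping the default is usually what you want.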

    4. Pandas Write Data To Excel File.

    1. Create a file writer with the pandas.ExcelWriter class.
      excel_writer = pandas.ExcelWriter(excel_file_path, engine='xlsxwriter')
    2. Call the DataFrame object’s to_excel method to write the DataFrame data to a named sheet of the Excel file.
      data_frame.to_excel(excel_writer, sheet_name='Employee Info')
    3. Call the writer’s close method to save the data to the Excel file (the older save method was removed in pandas 2.0).
      excel_writer.close()

    5. Python Pandas DataFrame Operation Methods.

    5.1 Sort DataFrame Data By One Column.

    1. Please note the data column name is case sensitive.
      data_frame.sort_values(by=['Salary'], ascending=False)
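Note that sort_values returns a new DataFrame rather than changing the original, so the result has to be assigned. A minimal sketch with a hand-made DataFrame (the sample rows are assumptions in the shape of the example data) illustrates this:

```python
import pandas

data_frame = pandas.DataFrame(
    {
        "Name": ["jerry", "tom", "kevin", "steven"],
        "Salary": [16000, 6000, 13000, 36000],
    }
)

# sort_values returns a new DataFrame; the original keeps its row order.
by_salary_desc = data_frame.sort_values(by=["Salary"], ascending=False)

print(by_salary_desc)
```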

    5.2 Query DataFrame Data In A Range.

    1. The below python code will query a range of data in the DataFrame object.
      data_frame = data_frame.loc[(data_frame['Salary'] > 10000) & (data_frame['Salary'] < 20000)]
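The same range query can be written with Series.between, which some readers find easier to scan. This is a sketch under the assumption of pandas 1.3 or newer, where the inclusive parameter accepts the string "neither" to exclude both bounds, matching the strict comparisons above.

```python
import pandas

data_frame = pandas.DataFrame(
    {
        "Name": ["jerry", "tom", "kevin", "jackie", "gary"],
        "Salary": [16000, 6000, 13000, 28000, 19000],
    }
)

# Keep rows with 10000 < Salary < 20000 (both bounds excluded).
mask = data_frame["Salary"].between(10000, 20000, inclusive="neither")
in_range = data_frame.loc[mask]

print(in_range)
```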

    6. Python Pandas Read/Write CSV File And Save To Excel File Example.

    1. Below is the content of the source CSV file used in this example; the file name is employee_info.csv.
      Name,Hire Date,Salary
      jerry,2010-01-01,16000
      tom,2011-08-19,6000
      kevin,2009-02-08,13000
      richard,2012-03-19,5000
      jackie,2015-06-08,28000
      steven,2008-02-01,36000
      jack,2006-09-19,8000
      gary,2018-01-16,19000
      john,2017-10-01,16600

    2. The example Python file is named CSVExcelConvertionExample.py; it contains the functions below.
    3. read_csv_file_by_pandas(csv_file).
      import os

      import pandas

      # Read a CSV file with the pandas module.
      def read_csv_file_by_pandas(csv_file):
          data_frame = None
          if os.path.exists(csv_file):
              data_frame = pandas.read_csv(csv_file)
              
              print("------------------data frame all----------------------")
              print(data_frame)
              
              print("------------------data frame index----------------------")
              print(data_frame.index)
              
              data_frame = data_frame.set_index('Name')
              print("------------------set Name column as data frame index----------------------")
              print(data_frame.index)
              
              print("------------------data frame columns----------------------")
              print(data_frame.columns)
              
              print("------------------data frame values----------------------")
              print(data_frame.values)
              
              print("------------------data frame hire date series----------------------")
              print(data_frame['Hire Date'])
              
              print("------------------select multiple columns from data frame----------------------")
              print(data_frame[['Salary', 'Hire Date']])
          else:
              print(csv_file + " does not exist.")
          return data_frame
      
      
      if __name__ == '__main__':    
          
          data_frame = read_csv_file_by_pandas("./employee_info.csv")
      
      ========================================================================================================
      Execution output:
      
      ------------------data frame all----------------------
            Name   Hire Date  Salary
      0    jerry  2010-01-01   16000
      1      tom  2011-08-19    6000
      2    kevin  2009-02-08   13000
      3  richard  2012-03-19    5000
      4   jackie  2015-06-08   28000
      5   steven  2008-02-01   36000
      6     jack  2006-09-19    8000
      7     gary  2018-01-16   19000
      8     john  2017-10-01   16600
      ------------------data frame index----------------------
      RangeIndex(start=0, stop=9, step=1)
      ------------------set Name column as data frame index----------------------
      Index(['jerry', 'tom', 'kevin', 'richard', 'jackie', 'steven', 'jack', 'gary',
             'john'],
            dtype='object', name='Name')
      ------------------data frame columns----------------------
      Index(['Hire Date', 'Salary'], dtype='object')
      ------------------data frame values----------------------
      [['2010-01-01' 16000]
       ['2011-08-19' 6000]
       ['2009-02-08' 13000]
       ['2012-03-19' 5000]
       ['2015-06-08' 28000]
       ['2008-02-01' 36000]
       ['2006-09-19' 8000]
       ['2018-01-16' 19000]
       ['2017-10-01' 16600]]
      ------------------data frame hire date series----------------------
      Name
      jerry      2010-01-01
      tom        2011-08-19
      kevin      2009-02-08
      richard    2012-03-19
      jackie     2015-06-08
      steven     2008-02-01
      jack       2006-09-19
      gary       2018-01-16
      john       2017-10-01
      Name: Hire Date, dtype: object
      ------------------select multiple columns from data frame----------------------
               Salary   Hire Date
      Name                       
      jerry     16000  2010-01-01
      tom        6000  2011-08-19
      kevin     13000  2009-02-08
      richard    5000  2012-03-19
      jackie    28000  2015-06-08
      steven    36000  2008-02-01
      jack       8000  2006-09-19
      gary      19000  2018-01-16
      john      16600  2017-10-01
      
    4. write_to_csv_file_by_pandas(csv_file_path, data_frame).
      import pandas
      
      import os
      
      import glob
      
      # read_csv_file_by_pandas(csv_file) is the same as in step 3 and is omitted here.
      
      # Write pandas.DataFrame object to a csv file.
      def write_to_csv_file_by_pandas(csv_file_path, data_frame):
          data_frame.to_csv(csv_file_path)
          print(csv_file_path + ' has been created.')
      
      
      if __name__ == '__main__':
        
          data_frame = read_csv_file_by_pandas("./employee_info.csv")
          
          write_to_csv_file_by_pandas("./employee_info_new.csv", data_frame)
      
      ================================================================================================================
      Execution output:
      
      ./employee_info_new.csv has been created.
      

    5. write_to_excel_file_by_pandas(excel_file_path, data_frame).
      import pandas
      
      import os
      
      import glob
      
      # read_csv_file_by_pandas(csv_file) is the same as in step 3 and is omitted here.
      
      
      # Write a pandas.DataFrame object to an Excel file.
      def write_to_excel_file_by_pandas(excel_file_path, data_frame):
          excel_writer = pandas.ExcelWriter(excel_file_path, engine='xlsxwriter')
          data_frame.to_excel(excel_writer, sheet_name='Employee Info')
          # ExcelWriter.save() was removed in pandas 2.0; close() writes the file.
          excel_writer.close()
          print(excel_file_path + ' has been created.')
      
      if __name__ == '__main__':
          
          data_frame = read_csv_file_by_pandas("./employee_info.csv")
          
          write_to_excel_file_by_pandas("./employee_info_new.xlsx", data_frame)
      
      ==========================================================================================
      Execution output:
      
      ./employee_info_new.xlsx has been created.
    6. sort_data_frame_by_string_column(data_frame).
      import pandas
      
      import os
      
      import glob
      
      # read_csv_file_by_pandas(csv_file) is the same as in step 3 and is omitted here.
      
      
      # Sort the DataFrame by Name (the index set in read_csv_file_by_pandas), whose values are strings.
      def sort_data_frame_by_string_column(data_frame):
          data_frame = data_frame.sort_values(by=['Name'])
          print("--------------Sort data format by string column---------------")
          print(data_frame)
      
      
      if __name__ == '__main__':
      
          data_frame = read_csv_file_by_pandas("./employee_info.csv")
          
          sort_data_frame_by_string_column(data_frame)
      
      ====================================================================================================
      Execution output:
      
      --------------Sort data format by string column---------------
                Hire Date  Salary
      Name                       
      gary     2018-01-16   19000
      jack     2006-09-19    8000
      jackie   2015-06-08   28000
      jerry    2010-01-01   16000
      john     2017-10-01   16600
      kevin    2009-02-08   13000
      richard  2012-03-19    5000
      steven   2008-02-01   36000
      tom      2011-08-19    6000
    7. sort_data_frame_by_datetime_column(data_frame).
      import pandas
      
      import os
      
      import glob
      
      # read_csv_file_by_pandas(csv_file) is the same as in step 3 and is omitted here.
      
      
      # Sort the DataFrame by Hire Date. The column holds ISO-formatted strings (dtype object), which sort chronologically.
      def sort_data_frame_by_datetime_column(data_frame):
          data_frame = data_frame.sort_values(by=['Hire Date'])
          print("--------------Sort data format by date column---------------")
          print(data_frame)
      
      
      if __name__ == '__main__':
          
          data_frame = read_csv_file_by_pandas("./employee_info.csv")
         
          sort_data_frame_by_datetime_column(data_frame)
      ===========================================================================================
      
      Execution output:
      
      --------------Sort data format by date column---------------
                Hire Date  Salary
      Name                       
      jack     2006-09-19    8000
      steven   2008-02-01   36000
      kevin    2009-02-08   13000
      jerry    2010-01-01   16000
      tom      2011-08-19    6000
      richard  2012-03-19    5000
      jackie   2015-06-08   28000
      john     2017-10-01   16600
      gary     2018-01-16   19000
      
      
    8. sort_data_frame_by_number_column(data_frame).
      import pandas
      
      import os
      
      import glob
      
      # read_csv_file_by_pandas(csv_file) is the same as in step 3 and is omitted here.
      
          
      # Sort the DataFrame by the numeric Salary column, in descending order.
      def sort_data_frame_by_number_column(data_frame):
          data_frame = data_frame.sort_values(by=['Salary'], ascending=False)
          print("--------------Sort data format by number column desc---------------")
          print(data_frame) 
      
      
      if __name__ == '__main__':
          data_frame = read_csv_file_by_pandas("./employee_info.csv")
      
          sort_data_frame_by_number_column(data_frame)
      
      ================================================================================
      
      Execution output:
      
      --------------Sort data format by number column desc---------------
                Hire Date  Salary
      Name                       
      steven   2008-02-01   36000
      jackie   2015-06-08   28000
      gary     2018-01-16   19000
      john     2017-10-01   16600
      jerry    2010-01-01   16000
      kevin    2009-02-08   13000
      jack     2006-09-19    8000
      tom      2011-08-19    6000
      richard  2012-03-19    5000
    9. get_data_in_salary_range(data_frame).
      import pandas
      
      import os
      
      import glob
      
      # read_csv_file_by_pandas(csv_file) is the same as in step 3 and is omitted here.
      
      
      
      # Select the rows whose Salary is strictly between 10000 and 20000.
      def get_data_in_salary_range(data_frame):
          data_frame = data_frame.loc[(data_frame['Salary'] > 10000) & (data_frame['Salary'] < 20000)]
          data_frame = data_frame.sort_values(by=['Salary'])
          
          print("-------------- Employee info whose salary between 10000 and 20000---------------")
          print(data_frame) 
      
      
      if __name__ == '__main__':
          
          data_frame = read_csv_file_by_pandas("./employee_info.csv")
          
          get_data_in_salary_range(data_frame)
      
      ==============================================================================================================
      
      Execution output:
      
      -------------- Employee info whose salary between 10000 and 20000---------------
              Hire Date  Salary
      Name                     
      kevin  2009-02-08   13000
      jerry  2010-01-01   16000
      john   2017-10-01   16600
      gary   2018-01-16   19000
    10. get_data_in_hire_date_range(data_frame).
      import pandas
      
      import os
      
      import glob
      
      # read_csv_file_by_pandas(csv_file) is the same as in step 3 and is omitted here.
      
       
      # Select the rows whose Hire Date is strictly between the two dates (string comparison works for ISO-formatted dates).
      def get_data_in_hire_date_range(data_frame):
          min_hire_date = '2010-01-01'
          max_hire_date = '2017-01-01'
          data_frame = data_frame.loc[(data_frame['Hire Date'] > min_hire_date) & (data_frame['Hire Date'] < max_hire_date)]
          data_frame = data_frame.sort_values(by=['Hire Date'])
          print("-------------- Employee info whose Hire Date between 2010/01/01 and 2017/01/01---------------")
          print(data_frame)
      
      
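      if __name__ == '__main__':
          
          data_frame = read_csv_file_by_pandas("./employee_info.csv")
          
          # read_csv_file_by_pandas returns None when the file is missing, so filter only on success.
          if data_frame is not None:
              get_data_in_hire_date_range(data_frame)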
      
      ====================================================================================================
      
      Execution output:
      
      -------------- Employee info whose Hire Date between 2010/01/01 and 2017/01/01---------------
                Hire Date  Salary
      Name                       
      tom      2011-08-19    6000
      richard  2012-03-19    5000
      jackie   2015-06-08   28000

    11. get_data_in_name_range(data_frame).
      import pandas
      
      import os
      
      # Read a CSV file with the pandas module.
      def read_csv_file_by_pandas(csv_file):
          data_frame = None
          if(os.path.exists(csv_file)):
              data_frame = pandas.read_csv(csv_file)
              
              print("------------------data frame all----------------------")
              print(data_frame)
              
              print("------------------data frame index----------------------")
              print(data_frame.index)
              
              data_frame = data_frame.set_index('Name')
              print("------------------set Name column as data frame index----------------------")
              print(data_frame.index)
              
              print("------------------data frame columns----------------------")
              print(data_frame.columns)
              
              print("------------------data frame values----------------------")
              print(data_frame.values)
              
              print("------------------data frame hire date series----------------------")
              print(data_frame['Hire Date'])
              
              print("------------------select multiple columns from data frame----------------------")
              print(data_frame[['Salary', 'Hire Date']])
          else:
              print(csv_file + " does not exist.")
          return data_frame
      
        
      # Get the rows whose Name falls inside a name range.
      def get_data_in_name_range(data_frame):
          start_name = 'jerry'
          end_name = 'kevin'
          # First sort the rows by the Name column.
          data_frame = data_frame.sort_values(by=['Name'])
          # Name is the index, so loc can slice by label directly; both end labels are included.
          data_frame = data_frame.loc[start_name:end_name]
          print("-------------- Employee info whose Name first character between jerry and kevin---------------")
          print(data_frame)
      
      
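      if __name__ == '__main__':
          
          data_frame = read_csv_file_by_pandas("./employee_info.csv")
          
          # read_csv_file_by_pandas returns None when the file is missing, so filter only on success.
          if data_frame is not None:
              get_data_in_name_range(data_frame)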
      
      ====================================================================================================
      
      Execution output:
      
      -------------- Employee info whose Name first character between jerry and kevin---------------
              Hire Date  Salary
      Name                     
      jerry  2010-01-01   16000
      john   2017-10-01   16600
      kevin  2009-02-08   13000
    12. convert_csv_to_excel_in_folder(folder_path): see section 7 below for this function's Python source code.
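
The hire-date filter in item 10 compares dates as strings, which gives the right order only while every value is written as YYYY-MM-DD. As a minimal sketch of an alternative, the column can be parsed into real datetime values with the parse_dates argument of read_csv before filtering. The helper name rows_hired_between and the inline SAMPLE_CSV are assumptions made for illustration; the sample is a subset rebuilt from the rows printed in the outputs above, not the full employee_info.csv.

```python
import io

import pandas as pd

# Hypothetical sample: a subset of employee_info.csv rebuilt from the printed outputs.
SAMPLE_CSV = """Name,Hire Date,Salary
kevin,2009-02-08,13000
jerry,2010-01-01,16000
tom,2011-08-19,6000
richard,2012-03-19,5000
jackie,2015-06-08,28000
john,2017-10-01,16600
gary,2018-01-16,19000
"""


def rows_hired_between(data_frame, start, end):
    """Return rows whose 'Hire Date' is strictly between start and end, sorted by date."""
    hire_dates = data_frame['Hire Date']
    # Strict comparisons mirror item 10: the boundary dates are excluded.
    mask = (hire_dates > pd.Timestamp(start)) & (hire_dates < pd.Timestamp(end))
    return data_frame.loc[mask].sort_values(by='Hire Date')


# parse_dates converts the column to datetime values instead of plain strings.
employees = pd.read_csv(io.StringIO(SAMPLE_CSV), parse_dates=['Hire Date'], index_col='Name')
hired = rows_hired_between(employees, '2010-01-01', '2017-01-01')
print(hired)
```

With real datetime values the comparison no longer depends on how the dates are spelled in the file, and the same frame can later be grouped or resampled by date.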

    7. How To Convert Multiple CSV Files In A Folder To Excel Files.

    1. Comment 95168 asks how to convert the CSV files in a directory to Excel files automatically. The example code below does this.
      import pandas
      
      import os
      
      import glob
      
      '''
      This function will convert all the CSV files in the folder_path to Excel files.
      '''
      def convert_csv_to_excel_in_folder(folder_path):
           
          # Loop all the CSV files under the path.
          for csv_file in glob.glob(os.path.join(folder_path, '*.csv')):
      
              # If the CSV file exists.
              if(os.path.exists(csv_file)):
      
                  # Build the target Excel path by swapping only the file extension.
                  excel_file_path = os.path.splitext(csv_file)[0] + ".xlsx"
      
                  # Read the CSV file with the pandas read_csv function.
                  data_frame = pandas.read_csv(csv_file)
      
                  # Create an Excel writer; the with block saves and closes the file on exit
                  # (ExcelWriter.save() was removed in pandas 2.0).
                  with pandas.ExcelWriter(excel_file_path, engine='xlsxwriter') as excel_writer:
      
                      # Write the data to a worksheet named sheet_1.
                      data_frame.to_excel(excel_writer, sheet_name='sheet_1')
      
                  print(excel_file_path + ' has been created.')
      
              else:
                  print(csv_file + " does not exist.")
      
      if __name__ == '__main__':
          
          convert_csv_to_excel_in_folder(".")
      
      ====================================================================================================
      
      Execution output:
      
      ./employee_info_new.xlsx has been created.
      ./employee_info.xlsx has been created.
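
When the target name is derived from the CSV name, only the final extension should change: a plain substring replacement of ".csv" would also rewrite a ".csv" that happens to appear earlier in the path, for example in a folder name. The following is a minimal sketch, using only the standard library, of a dry run that lists which files would be converted and to which names. The function name plan_conversions is hypothetical and is not part of the example above.

```python
from pathlib import Path


def plan_conversions(folder_path):
    """Return (csv_path, xlsx_path) pairs for every CSV file directly inside folder_path."""
    folder = Path(folder_path)
    # sorted() gives a predictable order; glob itself does not promise one.
    csv_paths = sorted(folder.glob('*.csv'))
    # with_suffix swaps only the last extension and leaves the rest of the path alone.
    return [(csv_path, csv_path.with_suffix('.xlsx')) for csv_path in csv_paths]


if __name__ == '__main__':
    # Dry run: print the planned conversions without writing any file.
    for csv_path, xlsx_path in plan_conversions('.'):
        print(f'{csv_path} -> {xlsx_path}')
```

Each pair can then be handed to the pandas calls shown above, so the naming logic can be checked separately from the actual Excel writing.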
      
      
