Python Pandas read_json: Reading JSON - CJavaPy

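1. Reading JSON

Large datasets are often stored or extracted as JSON.

JSON is plain text with an object-like structure, and it is widely supported across the programming world, including by Pandas.

The examples use a JSON file named "data.json".

data.json file: https://www.cjavapy.com/download/5fe1f8c9dc72d93b4993067d/

Example:

Load the JSON file into a DataFrame: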

import pandas as pd

df = pd.read_json('data.json')

print(df.to_string())

Tip: use to_string() to print the entire DataFrame.
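
To illustrate the tip, the sketch below builds a hypothetical 100-row DataFrame (the column name value is made up for illustration) and compares the default text form with to_string(). It assumes pandas' default display options, under which long frames are abbreviated when printed.

```python
import pandas as pd

# Hypothetical 100-row frame, used only for illustration
df = pd.DataFrame({"value": range(100)})

# Default text form: with the default display options, long frames are truncated
short_text = str(df)

# to_string() renders every row regardless of the display options
full_text = df.to_string()

print(len(short_text.splitlines()))  # fewer lines, because rows were elided
print(len(full_text.splitlines()))   # one header line plus 100 data rows
```

For very large files, printing everything can be slow, so df.head() may be the more practical choice.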

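2. Dictionary in JSON Format

JSON objects look much like Python dictionaries.

If the JSON data is not in a file but already in a Python dictionary, it can be loaded directly into a DataFrame:

Example:

Load a Python dictionary into a DataFrame: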

import pandas as pd

data = {
    "Duration": {"0": 60, "1": 60, "2": 60, "3": 45, "4": 45, "5": 60},
    "Pulse": {"0": 110, "1": 117, "2": 103, "3": 109, "4": 117, "5": 102},
    "Maxpulse": {"0": 130, "1": 145, "2": 135, "3": 175, "4": 148, "5": 127},
    "Calories": {"0": 409, "1": 479, "2": 340, "3": 282, "4": 406, "5": 300},
}

# Convert the nested dictionary to a DataFrame (outer keys become columns)
df = pd.DataFrame(data)

# Print the DataFrame
print(df)
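
Because a JSON object maps onto a Python dictionary, JSON text held in a string can be turned into a dictionary with the standard json module and then passed to pd.DataFrame. The sketch below uses a shortened, two-row version of the data above; the variable names are illustrative. On this path the row labels stay strings ("0", "1"), so the last step converts them to integers, which is optional.

```python
import json

import pandas as pd

# JSON text: a shortened, two-row version of the data above
json_text = '{"Duration": {"0": 60, "1": 60}, "Pulse": {"0": 110, "1": 117}}'

# json.loads turns the JSON object into a nested Python dict
data = json.loads(json_text)
df = pd.DataFrame(data)

# Outer keys become columns; inner keys become string row labels
print(df.index.tolist())

# Optional: convert the row labels from strings to integers
df.index = df.index.astype(int)
print(df)
```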

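3. Common read_json Operations

The read_json() function reads JSON data and converts it to a DataFrame. Common operations include setting the orient parameter to specify the structure of the JSON data (records, split, index, and so on), using lines=True to read files that hold one JSON record per line (such as log files), and controlling type inference and date parsing through parameters such as dtype and convert_dates.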
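
As a minimal sketch of the dtype and convert_dates parameters mentioned above: the records and field names (code, created, amount) are hypothetical, and StringIO stands in for a file on disk. Without the dtype entry, type inference may turn a value such as "007" into a number.

```python
from io import StringIO

import pandas as pd

# Hypothetical records; the field names are made up for illustration
json_text = """[
  {"code": "007", "created": "2024-01-15", "amount": 12.5},
  {"code": "042", "created": "2024-02-01", "amount": 8.0}
]"""

df = pd.read_json(
    StringIO(json_text),
    dtype={"code": str},        # keep "007" as text instead of inferring a number
    convert_dates=["created"],  # parse this column as datetimes
)

print(df.dtypes)
print(df)
```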

1) Reading one JSON object per line (e.g. log files)

import pandas as pd

# Read a file that contains one JSON object per line
df = pd.read_json('data_lines.json', lines=True)

print(df)
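
To try lines=True without creating a file first, the following sketch feeds in-memory text to read_json. The log content and its field names (level, msg, code) are hypothetical; StringIO stands in for a file such as data_lines.json.

```python
from io import StringIO

import pandas as pd

# Hypothetical log content: one JSON object per line
log_text = (
    '{"level": "INFO", "msg": "started", "code": 0}\n'
    '{"level": "ERROR", "msg": "disk full", "code": 28}\n'
    '{"level": "INFO", "msg": "stopped", "code": 0}\n'
)

# Each line is parsed as one record and becomes one row
df = pd.read_json(StringIO(log_text), lines=True)

print(df)

# Ordinary DataFrame filtering then applies, e.g. keeping only the errors
print(df[df["level"] == "ERROR"])
```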

2) Specifying the orient format

import pandas as pd

# Read a JSON file; orient='records' expects a list of objects, one dict per record
df = pd.read_json('data.json', orient='records')

print(df)

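This suits JSON files in the typical array layout, where each record is an object; it is common for API responses and log analysis. The file must actually use this layout: data shaped like the dictionary in section 2, whose outer keys are column names, matches orient='columns' instead.

orient parameter values (each describes the JSON structure that read_json expects)

orient	Description
`records`	List of dictionaries, one per record (row)
`split`	Dictionary with the keys index, columns, and data
`index`	Dictionary of dictionaries; outer keys are row index labels
`columns`	Dictionary of dictionaries; outer keys are column names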
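
To make the four layouts concrete, the sketch below writes the same two-row table in each of them and reads every version back with the matching orient value. The column names a and b are hypothetical, and StringIO stands in for files on disk.

```python
from io import StringIO

import pandas as pd

# The same two-row table written in each layout (hypothetical columns a and b)
layouts = {
    "records": '[{"a": 1, "b": 3}, {"a": 2, "b": 4}]',
    "split": '{"index": [0, 1], "columns": ["a", "b"], "data": [[1, 3], [2, 4]]}',
    "index": '{"0": {"a": 1, "b": 3}, "1": {"a": 2, "b": 4}}',
    "columns": '{"a": {"0": 1, "1": 2}, "b": {"0": 3, "1": 4}}',
}

frames = {}
for orient, text in layouts.items():
    # orient must match the layout of the JSON text being read
    frames[orient] = pd.read_json(StringIO(text), orient=orient)
    print(orient)
    print(frames[orient])
```

All four calls should yield a table with the same cell values; only the shape of the JSON text differs.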
