Request analysis
Start with the final query request:
POST /icpproject_query/api/icpAbbreviateInfo/queryByCondition HTTP/1.1
Host: hlwicpfwc.miit.gov.cn
Cookie: __jsluid_s=ae00a1e14491f21530d47bd8091b587a
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:122.0) Gecko/20100101 Firefox/122.0
Accept: application/json, text/plain, */*
Accept-Language: zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2
Accept-Encoding: gzip, deflate, br
Content-Type: application/json
Uuid: 42084b64747a4c96b90e7903812a4ec9
Sign: eyJ0eXBlIjozLCJleHREYXRhIjp7InZhZnljb2RlX2ltYWdlX2tleSI6IjQyMDg0YjY0NzQ3YTRjOTZiOTBlNzkwMzgxMmE0ZWM5In0sImUiOjE3MDY2MTAxMjYwNDF9.5phR5zdDcMcoVJN5y-LqTUgWvS11dEoJMpW1cuxNc0E
Rci: 77839f050ddc4596b9e22e3574c237f0
Token: eyJ0eXBlIjoxLCJ1IjoiMDk4ZjZiY2Q0NjIxZDM3M2NhZGU0ZTgzMjYyN2I0ZjYiLCJzIjoxNzA2NjA5NjE3NzMxLCJlIjoxNzA2NjEwMDk3NzMxfQ.pOCWH8ru6VkpgS3rbUk607mSGYlDqxpMrTI0ADrcc44
Content-Length: 96
Origin: https://beian.miit.gov.cn
Dnt: 1
Sec-Gpc: 1
Referer: https://beian.miit.gov.cn/
Sec-Fetch-Dest: empty
Sec-Fetch-Mode: cors
Sec-Fetch-Site: same-site
Te: trailers
Connection: close
{"pageNum":2,"pageSize":10,"unitName":"XXX有限公司","serviceType":1}
Only these five values in the request have no obvious origin:
- Cookie: __jsluid_s=ae00a1e14491f21530d47bd8091b587a
- Uuid: 42084b64747a4c96b90e7903812a4ec9
- Sign: eyJ0eXBlIjozLCJleHREYXRhIjp7InZhZnljb2RlX2ltYWdlX2tleSI6IjQyMDg0YjY0NzQ3YTRjOTZiOTBlNzkwMzgxMmE0ZWM5In0sImUiOjE3MDY2MTAxMjYwNDF9.5phR5zdDcMcoVJN5y-LqTUgWvS11dEoJMpW1cuxNc0E
- Rci: 77839f050ddc4596b9e22e3574c237f0
- Token: eyJ0eXBlIjoxLCJ1IjoiMDk4ZjZiY2Q0NjIxZDM3M2NhZGU0ZTgzMjYyN2I0ZjYiLCJzIjoxNzA2NjA5NjE3NzMxLCJlIjoxNzA2NjEwMDk3NzMxfQ.pOCWH8ru6VkpgS3rbUk607mSGYlDqxpMrTI0ADrcc44
__jsluid_s
Searching the captured traffic for ae00a1e14491f21530d47bd8091b587a, the first packet that contains it is this response:
HTTP/1.1 200 OK
Date: Tue, 30 Jan 2024 10:16:37 GMT
Content-Type: text/html;charset=UTF-8
Connection: close
Vary: Accept-Encoding
Vary: Accept-Encoding
Vary: Origin
Vary: Access-Control-Request-Method
Vary: Access-Control-Request-Headers
Access-Control-Allow-Origin: https://beian.miit.gov.cn
Access-Control-Expose-Headers: Access-Control-Allow-Origin, Access-Control-Allow-Credentials, rci
Access-Control-Allow-Credentials: true
Strict-Transport-Security: max-age=15724800; includeSubDomains
X-Via-JSL: ba1114a,-
Set-Cookie: __jsluid_s=ae00a1e14491f21530d47bd8091b587a; max-age=31536000; path=/; HttpOnly; SameSite=None; secure
X-Cache: bypass
Content-Length: 419
{"code":200,"msg":"操作成功","params":{"bussiness":"eyJ0eXBlIjoxLCJ1IjoiMDk4ZjZiY2Q0NjIxZDM3M2NhZGU0ZTgzMjYyN2I0ZjYiLCJzIjoxNzA2NjA5NjE3NzMxLCJlIjoxNzA2NjEwMDk3NzMxfQ.pOCWH8ru6VkpgS3rbUk607mSGYlDqxpMrTI0ADrcc44","expire":300000,"refresh":"eyJ0eXBlIjoyLCJ1IjoiMDk4ZjZiY2Q0NjIxZDM3M2NhZGU0ZTgzMjYyN2I0ZjYiLCJzIjoxNzA2NjA5NjE3NzMxLCJlIjoxNzA2NjEwMzk3NzMxfQ.xnbwyV593KAWwv9bTIQKgqDPdzRssNAyteKbbjkBsRw"},"success":true}
The corresponding request carries an authKey of unknown origin, 0d5b07c64a82f07dc6e9ec472e39458f:
POST /icpproject_query/api/auth HTTP/1.1
Host: hlwicpfwc.miit.gov.cn
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:122.0) Gecko/20100101 Firefox/122.0
Accept: */*
Accept-Language: zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2
Accept-Encoding: gzip, deflate, br
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
Content-Length: 64
Origin: https://beian.miit.gov.cn
Dnt: 1
Sec-Gpc: 1
Referer: https://beian.miit.gov.cn/
Sec-Fetch-Dest: empty
Sec-Fetch-Mode: cors
Sec-Fetch-Site: same-site
Te: trailers
Connection: close
authKey=0d5b07c64a82f07dc6e9ec472e39458f&timeStamp=1706609797163
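That authKey value does not appear in any other packet, so search the JavaScript for authKey instead.
Reading the matches from the bottom up, the last authKey is assigned from a variable n.
Tracing n upward, n is w.authKey(g,A,t). Just above it, t is the 13-digit millisecond timestamp from (new Date).getTime(), while g and A are passed in by the function auth.
authKey concatenates the three values g, A and t and returns the MD5 of the result.
Looking up the function auth shows that g and A are both the string test.
A breakpoint confirms this analysis.
Replaying the request with a freshly computed value returns a valid response, while an incorrect value is rejected.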
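The authKey derivation described above can be sketched in a few lines of Python. This is a minimal sketch, not the site's code: the function names and the parameter names account and secret are made up here (the page's JavaScript only calls them g and A), and the g + A + t concatenation order is taken from the analysis above.

```python
import hashlib
import time
from urllib.parse import urlencode


def compute_auth_key(timestamp_ms, account="test", secret="test"):
    """MD5 hex digest of account + secret + timestamp (the g, A, t of the page script)."""
    raw = f"{account}{secret}{timestamp_ms}"
    return hashlib.md5(raw.encode("utf-8")).hexdigest()


def build_auth_form(timestamp_ms=None):
    """Return the urlencoded body for POST /icpproject_query/api/auth."""
    if timestamp_ms is None:
        # 13-digit millisecond timestamp, like (new Date).getTime()
        timestamp_ms = int(time.time() * 1000)
    return urlencode({
        "authKey": compute_auth_key(timestamp_ms),
        "timeStamp": timestamp_ms,
    })


if __name__ == "__main__":
    print(build_auth_form())
```

For a 13-digit timestamp the resulting body is 64 bytes long, which agrees with the Content-Length: 64 of the captured auth request.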
Uuid
Searching for 42084b64747a4c96b90e7903812a4ec9, the first packet that contains it is the response below.
The Uuid value comes from params['uuid'] in its JSON body:
HTTP/1.1 200 OK
Date: Tue, 30 Jan 2024 10:16:55 GMT
Content-Type: application/json
Connection: close
Vary: Accept-Encoding
Vary: Accept-Encoding
Vary: Origin
Vary: Access-Control-Request-Method
Vary: Access-Control-Request-Headers
Access-Control-Allow-Origin: https://beian.miit.gov.cn
Access-Control-Expose-Headers: Access-Control-Allow-Origin, Access-Control-Allow-Credentials, rci
Access-Control-Allow-Credentials: true
Strict-Transport-Security: max-age=15724800; includeSubDomains
X-Via-JSL: 559ed21,-
X-Cache: bypass
Content-Length: 269568
{"code":200,"msg":"操作成功","params":{"bigImage":"iVBORw0KGgo……BJRU5ErkJggg==","secretKey":"QctFZYMwYcKSku7Y","smallImage":"iVBORw0KGgo……ElFTkSuQmCC","uuid":"42084b64747a4c96b90e7903812a4ec9","wordCount":4},"success":true}
The corresponding request is:
POST /icpproject_query/api/image/getCheckImagePoint HTTP/1.1
Host: hlwicpfwc.miit.gov.cn
Cookie: __jsluid_s=ae00a1e14491f21530d47bd8091b587a
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:122.0) Gecko/20100101 Firefox/122.0
Accept: application/json, text/plain, */*
Accept-Language: zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2
Accept-Encoding: gzip, deflate, br
Content-Type: application/json
Token: eyJ0eXBlIjoxLCJ1IjoiMDk4ZjZiY2Q0NjIxZDM3M2NhZGU0ZTgzMjYyN2I0ZjYiLCJzIjoxNzA2NjA5NjE3NzMxLCJlIjoxNzA2NjEwMDk3NzMxfQ.pOCWH8ru6VkpgS3rbUk607mSGYlDqxpMrTI0ADrcc44
Content-Length: 58
Origin: https://beian.miit.gov.cn
Dnt: 1
Sec-Gpc: 1
Referer: https://beian.miit.gov.cn/
Sec-Fetch-Dest: empty
Sec-Fetch-Mode: cors
Sec-Fetch-Site: same-site
Te: trailers
Connection: close
{"clientUid":"point-a3f7374f-986a-4b6a-87bf-ba035ae77c9b"}
This request already carries two values, Token and clientUid, so the next step is to trace where each of them comes from.
clientUid
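Searching for clientUid leads to the function that sends this request; it reads the value from localStorage.
Since the read is getItem("point"), search for setItem("point".
The value appears to be generated randomly.
Sending an arbitrary freshly generated UUID also returns a valid response.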
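Since any random UUID seems to be accepted, a client only needs to produce a string of the same shape as the captured one. A minimal sketch, assuming the "point-" prefix plus a version 4 UUID seen in the capture is all the server expects (the function name is made up here):

```python
import uuid


def new_client_uid():
    """Build a clientUid shaped like the captured 'point-a3f7374f-...' value."""
    return f"point-{uuid.uuid4()}"


if __name__ == "__main__":
    print(new_client_uid())
```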
Token
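Searching for eyJ0eXB……NzMxfQ shows that the first packet containing it is the same response that set the __jsluid_s Cookie 😅 (it is the bussiness field of that response's params).
So the Token can be collected in the same step as the Cookie.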
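Pulling the Token out of the auth response is a plain dictionary lookup. The sketch below also decodes the part of the token before the dot, which in the capture above is unpadded base64url JSON; treating that layout as stable is an assumption, and both function names are made up for illustration.

```python
import base64
import json


def extract_token(auth_response):
    """Return the value later sent as the Token header (params.bussiness)."""
    return auth_response["params"]["bussiness"]


def decode_token_payload(token):
    """Decode the JSON payload that precedes the first dot of the token."""
    payload = token.split(".", 1)[0]
    # base64url text in the capture is unpadded, so restore the padding first
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))
```

Decoding the captured token this way gives an object with the fields type, u, s and e, where s and e look like millisecond timestamps.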
Sign
Searching for eyJ0eXB……MjYwNDF9, the first packet that contains it is the response to the following request:
POST /icpproject_query/api/image/checkImage HTTP/1.1
Host: hlwicpfwc.miit.gov.cn
Cookie: __jsluid_s=ae00a1e14491f21530d47bd8091b587a
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:122.0) Gecko/20100101 Firefox/122.0
Accept: application/json, text/plain, */*
Accept-Language: zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2
Accept-Encoding: gzip, deflate, br
Content-Type: application/json
Token: eyJ0eXBlIjoxLCJ1IjoiMDk4ZjZiY2Q0NjIxZDM3M2NhZGU0ZTgzMjYyN2I0ZjYiLCJzIjoxNzA2NjA5NjE3NzMxLCJlIjoxNzA2NjEwMDk3NzMxfQ.pOCWH8ru6VkpgS3rbUk607mSGYlDqxpMrTI0ADrcc44
Content-Length: 255
Origin: https://beian.miit.gov.cn
Dnt: 1
Sec-Gpc: 1
Referer: https://beian.miit.gov.cn/
Sec-Fetch-Dest: empty
Sec-Fetch-Mode: cors
Sec-Fetch-Site: same-site
Te: trailers
Connection: close
{"token":"42084b64747a4c96b90e7903812a4ec9","secretKey":"QctFZYMwYcKSku7Y","clientUid":"point-a3f7374f-986a-4b6a-87bf-ba035ae77c9b","pointJson":"0CLJn2Mxxj1rsH3COFz0roWmZDHzgW+vlUt6vsQljB6o9V6j6ncEFMkNT68eqbwIXmIuFNhe8u7cfGFhFOAY2Fny68afmFITf+KFccUur6U="}
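The token, secretKey and clientUid fields all come from the exchange that returned the Uuid (token is that uuid), so pointJson must be the captcha answer.
Searching for pointJson shows that it is the result of a function h.
Tracing the arguments upward, g.checkPointArr is an array and g.secretKey is the secretKey value from the request.
Function h is an instance of m, and m looks like an AES encryption function.
If A is passed in empty, it defaults to XwKsGlMcdPMEhR1B, which is presumably the IV (offset).
A breakpoint agrees with this reading.
What remains is working out how to solve the captcha.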
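The mapping between the two packets can be written down directly. A minimal sketch, assuming the field names seen in the captures; the function name is made up, and point_json is treated as an opaque string that has already been computed elsewhere:

```python
import json


def build_check_image_body(image_response, client_uid, point_json):
    """Assemble the JSON body for POST /icpproject_query/api/image/checkImage."""
    params = image_response["params"]
    body = {
        "token": params["uuid"],           # the uuid returned with the images
        "secretKey": params["secretKey"],  # returned in the same response
        "clientUid": client_uid,           # the value sent when requesting the images
        "pointJson": point_json,           # captcha answer, computed separately
    }
    return json.dumps(body, separators=(",", ":"))
```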
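Before any recognition can happen, the two images have to be pulled out of the getCheckImagePoint response. The bigImage and smallImage values start with iVBORw0KGgo, the base64 form of the PNG signature, so a minimal sketch (function names and default file names are assumptions) is:

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_png_field(b64_text):
    """Decode one base64 image field and check that it really is a PNG."""
    data = base64.b64decode(b64_text)
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data is not a PNG image")
    return data


def save_captcha_images(params, big_path="bigImage.png", small_path="smallImage.png"):
    """Write params['bigImage'] and params['smallImage'] to disk as PNG files."""
    for key, path in (("bigImage", big_path), ("smallImage", small_path)):
        with open(path, "wb") as handle:
            handle.write(decode_png_field(params[key]))
```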
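Captcha analysis
Click-to-select
I tried ddddocr's text recognition on bigImage.png; the accuracy is basically unusable.
Its detection of the positions to click, however, works quite well.
One project's page describes solving this with a Siamese neural network, and many click-captcha walkthroughs on Bilibili follow the same approach.
In smallImage.png the position and spacing of the characters are fixed, and each character is about 24x24 pixels.
The top-left corners of the characters are at (167, 13), (200, 13), (233, 13), (266, 13).
Individual Chinese characters vary slightly, so I use the 9-pixel gap occupied by the comma as the spacing; adding the 24-pixel character width gives a step of exactly 33.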
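The fixed layout translates into crop rectangles with simple arithmetic. A small sketch using the numbers above; the constant and function names are made up, and the (left, upper, right, lower) order is chosen because that is the box layout Pillow's Image.crop accepts:

```python
FIRST_LEFT = 167  # x of the first character's top-left corner
TOP = 13          # y shared by all characters
CHAR_SIZE = 24    # approximate width and height of one character
GAP = 9           # spacing between characters (the comma)


def target_char_boxes(count=4):
    """Return (left, upper, right, lower) boxes for the characters in smallImage."""
    step = CHAR_SIZE + GAP  # 33 pixels from one character to the next
    boxes = []
    for index in range(count):
        left = FIRST_LEFT + index * step
        boxes.append((left, TOP, left + CHAR_SIZE, TOP + CHAR_SIZE))
    return boxes


if __name__ == "__main__":
    for box in target_char_boxes():
        print(box)
```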
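Recognition
Searching for how this kind of captcha is recognized, nearly every solution uses a Siamese network.
The first step is a script that detects and crops the clickable characters in bigImage and the target characters in smallImage to use as training samples.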
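Once a similarity model exists, the click order follows from pairing each target character with its most similar crop from bigImage. The sketch below shows that pairing step only; it is an assumed greedy strategy, not something taken from the tutorial, and similarity stands in for whatever scoring function the trained network provides.

```python
def match_targets(targets, candidates, similarity):
    """For each target, in order, pick the index of the most similar unused candidate."""
    used = set()
    order = []
    for target in targets:
        best_index, best_score = -1, float("-inf")
        for index, candidate in enumerate(candidates):
            if index in used:
                continue
            score = similarity(target, candidate)
            if score > best_score:
                best_index, best_score = index, score
        if best_index < 0:
            raise ValueError("fewer candidates than targets")
        used.add(best_index)
        order.append(best_index)
    return order
```

The returned indices give the order in which the detected boxes in bigImage would be clicked.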
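Tutorial used: "Building your own Siamese neural network image-similarity platform with PyTorch" (Bubbliiiing deep learning tutorial). It requires Python 3.7; with anything newer the dependencies fail to install 😅
- Clone the repository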
git clone https://github.com/bubbliiiing/Siamese-pytorch.git
- Install Python 3.7 with pyenv (latest available: 3.7.9 on Windows, 3.7.17 on Linux)
pyenv install 3.7.17
- Create and activate a virtual environment
cd Siamese-pytorch && python -m venv ./venv
source ./venv/bin/activate
- Install the dependencies
pip install torch==1.2.0 torchvision==0.4.0 -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements.txt
- Download vgg16-397923af.pth (the pretrained VGG model) and save it in the model_data directory
train.py
diff --git a/train.py b/train.py
index 9288dc3..ceb6f3f 100644
--- a/train.py
+++ b/train.py
@@ -20,7 +20,7 @@ if __name__ == "__main__":
# 是否使用Cuda
# 没有GPU可以设置成False
#----------------------------------------------------#
- Cuda = True
+ Cuda = False
#---------------------------------------------------------------------#
# distributed 用于指定是否使用单机多卡分布式运行-
# 终端指令仅支持Ubuntu。CUDA_VISIBLE_DEVICES用于在Ubuntu下指定显卡。-
@@ -41,7 +41,7 @@ if __name__ == "__main__":
# fp16 是否使用混合精度训练_
# 可减少约一半的显存、需要pytorch1.7.1以上D
#---------------------------------------------------------------------#
- fp16 = False
+ fp16 = True
#----------------------------------------------------#
# 数据集存放的路径-
#----------------------------------------------------#
@@ -84,7 +84,7 @@ if __name__ == "__main__":
# 网络一般不从0开始训练,至少会使用主干部分的权值,有些论文提到可以不用预训练,主要原因是他们 数据集较大 且 调参能力优秀。-
# 如果一定要训练网络的主干部分,可以了解imagenet数据集,首先训练分类模型,分类模型的 主干部分 和该模型通用,基于此进行训练。#
#----------------------------------------------------------------------------------------------------------------------------#
- model_path = ""
+ model_path = "model_data/vgg16-397923af.pth"
#----------------------------------------------------------------------------------------------------------------------------#
# 显存不足与数据集大小无关,提示显存不足请调小batch_size。-
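I have no NVIDIA GPU, so Cuda is turned off. Training as shown in the video tutorial produces best_epoch_weights.pth in the logs directory.
Conversion
Following the Microsoft documentation, convert the PyTorch-specific .pth trained model into a general-purpose ONNX model file.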
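import torch
import torch.onnx

# Network and input_size are placeholders taken from the Microsoft tutorial.
# Replace them with the model class and input size used for training.

# Function to convert to ONNX
def Convert_ONNX(model, input_size):
    # model.eval() or model.train(False) must be called before exporting
    # so that the model is in inference mode
    model.eval()
    # Create a dummy input tensor
    dummy_input = torch.randn(1, input_size, requires_grad=True)
    # Export the model
    torch.onnx.export(
        model,                           # model being run
        dummy_input,                     # model input
        "ImageClassifier.onnx",          # where to save the model
        export_params=True,              # store the trained weights in the model file
        opset_version=10,                # ONNX opset version to export to
        do_constant_folding=True,        # whether to apply constant-folding optimization
        input_names=['modelInput'],      # the model's input names
        output_names=['modelOutput'],    # the model's output names
        dynamic_axes={'modelInput': {0: 'batch_size'},    # variable-length axes
                      'modelOutput': {0: 'batch_size'}})
    print('Model has been converted to ONNX')

if __name__ == "__main__":
    model = Network()
    path = "myFirstModel.pth"
    model.load_state_dict(torch.load(path))
    # Conversion to ONNX
    Convert_ONNX(model, input_size)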
References
Reverse-engineering the click captcha of a certain query website
Convert your PyTorch training model to ONNX | Microsoft Learn