关于b站新弹幕格式.pb

格式为google protocol buffer (protobuf)

地址格式为{cid}-{part}.pb,每part包含3分钟的弹幕内容,即边播放边加载模式,如-3.pb包含第6-9分弹幕

在bilibiliPlayer.min.js中有大部分field id的定义,但文件中出现了id12-14

pb格式弹幕 Field常量
1  row (Row)
2  chat_server (string)
3  chat_id (varint32)
4  mission (varint32)
5  max_limit (varint32)
6  source (string)
7  ds (varint32)
8  de (varint32)
9  max_count (varint32)
10 realname (varint32)
11 sectionlen (varint32)(一般为180)
//猜测id
12 duration (float)
13 total_count (varint32)
14 pb_version (varint32)(目前常量1)

pb格式弹幕row Field常量
1  playtime (float)
2  mode (varint32)
3  fontsize (varint32)
4  color (varint32)
5  times (varint32)
6  poolid (varint32)
7  hash (string)
8  dmid (varint32)
9  msg (string)
10 uid (varint32)(暂未出现)
11 uname (string)(暂未出现)

其中varint32为变长度int32格式,具体为每字节只有后7位存储数据,第1位为指示位,为1时继续读取下一字节

function readVarInt32($fp){
	$b=0;
	$result=0;
	
	do{
		$b=unpack('C',fread($fp,1))[1];
		$result  = $b & 0x7f;
		if(!( $b & 0x80 ))
			break;
		
		$b=unpack('C',fread($fp,1))[1];
		$result |= ($b & 0x7f)<<7;
		if(!( $b & 0x80 ))
			break;
		
		$b=unpack('C',fread($fp,1))[1];
		$result |= ($b & 0x7f)<<14;
		if(!( $b & 0x80 ))
			break;
		
		$b=unpack('C',fread($fp,1))[1];
		$result |= ($b & 0x7f)<<21;
		if(!( $b & 0x80 ))
			break;
		
		$b=unpack('C',fread($fp,1))[1];
		$result |= ($b & 0x7f)<<28;
		if(!( $b & 0x80 ))
			break;
		
		for($i=0;$i<5;$i++){
			$b=unpack('C',fread($fp,1))[1];
			if(!( $b & 0x80 ))
				break;
		}
	}while(false);
	return $result;
}

文件读取代码样本

<?php
/*
 BiliBili pb danmaku reader
 @author esterTion(esterTionCN@gmail.com)
*/
$pb=fopen('http://comment.bilibili.com/14328539-1.pb','r');

function readVarInt32($fp){
	$b=0;
	$result=0;
	
	do{
		$b=unpack('C',fread($fp,1))[1];
		$result  = $b & 0x7f;
		if(!( $b & 0x80 ))
			break;
		
		$b=unpack('C',fread($fp,1))[1];
		$result |= ($b & 0x7f)<<7;
		if(!( $b & 0x80 ))
			break;
		
		$b=unpack('C',fread($fp,1))[1];
		$result |= ($b & 0x7f)<<14;
		if(!( $b & 0x80 ))
			break;
		
		$b=unpack('C',fread($fp,1))[1];
		$result |= ($b & 0x7f)<<21;
		if(!( $b & 0x80 ))
			break;
		
		$b=unpack('C',fread($fp,1))[1];
		$result |= ($b & 0x7f)<<28;
		if(!( $b & 0x80 ))
			break;
		
		for($i=0;$i<5;$i++){
			$b=unpack('C',fread($fp,1))[1];
			if(!( $b & 0x80 ))
				break;
		}
	}while(false);
	return $result;
}
function readString($fp){
	$strLength=readVarInt32($fp);
	return fread($fp,$strLength);
}
$types=array(
	'VARINT'=>0,
	'FIXED64'=>1,
	'DELIMITED'=>2,
	'START_GROUP'=>3,
	'END_GROUP'=>4,
	'FIXED32'=>5
);
$rowCount=0;
while(true){
	$id=fread($pb,1);
	if($id==='')
		break;
	$id=unpack('C',$id)[1];
	$type = $id & 7;
	$id = $id >> 3;
	switch($id){
		case 1:
			$rowLength=readVarInt32($pb);
			echo 'row length:'.$rowLength.' ';
			$end=ftell($pb)+$rowLength;
			$row=array();
			while(ftell($pb) < $end){
				$id=unpack('C',fread($pb,1))[1];
				$id = $id >> 3;
				switch($id){
					case 1:
						$row['playtime']=unpack('f',fread($pb,4))[1];
					break;
					case 2:
						$row['mode']=readVarInt32($pb);
					break;
					case 3:
						$row['fontsize']=readVarInt32($pb);
					break;
					case 4:
						$row['color']=substr(pack('N', readVarInt32($pb) ),2);
					break;
					case 5:
						$row['timestamp']=readVarInt32($pb);
					break;
					case 6:
						$row['poolid']=readVarInt32($pb);
					break;
					case 7:
						$row['hash']=readString($pb);
					break;
					case 8:
						$row['dmid']=readVarInt32($pb);
					break;
					case 9:
						$row['msg']=readString($pb);
					break;
					case 10:
						$row['uid']=readVarInt32($pb);
					break;
					case 11:
						$row['uname']=readString($pb);
					break;
				}
			}
			print_r($row);
			$rowCount++;
		break;
		case 2:
			echo '[chat_server] => '.readString($pb)."\n";
		break;
		case 3:
			echo '[chat_id] => '.readVarInt32($pb)."\n";
		break;
		case 4:
			echo '[mission] => '.readVarInt32($pb)."\n";
		break;
		case 5:
			echo '[max_limit] => '.readVarInt32($pb)."\n";
		break;
		case 6:
			echo '[source] => '.readString($pb)."\n";
		break;
		case 7:
			echo '[ds] => '.readVarInt32($pb)."\n";
		break;
		case 8:
			echo '[de] => '.readVarInt32($pb)."\n";
		break;
		case 9:
			echo '[max_count] => '.readVarInt32($pb)."\n";
		break;
		case 10:
			echo '[realname] => '.readVarInt32($pb)."\n";
		break;
		case 11:
			echo '[sectionlen] => '.readVarInt32($pb)."\n";
		break;
		default:
			switch($type){
				case 0:
					echo '--Unknown ID '.$id.' with varint32 value '.readVarInt32($pb)."\n";
				break;
				case 1:
					echo '--Unknown ID '.$id.' with fixed 64bit data hex '.bin2hex(fread($pb,8))."\n";
				break;
				case 5:
					echo '--Unknown ID '.$id.' with fixed 32bit data hex '.bin2hex(fread($pb,4))."\n";
				break;
				default:
					echo '--Unknown ID '.$id.' with type '.$type."\n";
			}
	}
}
echo $rowCount.' rows in file'."\n";

 

2 thoughts on “关于b站新弹幕格式.pb”

发表回复

您的电子邮箱地址不会被公开。 必填项已用 * 标注

To create code blocks or other preformatted text, indent by four spaces:

    This will be displayed in a monospaced font. The first four 
    spaces will be stripped off, but all other whitespace
    will be preserved.
    
    Markdown is turned off in code blocks:
     [This is not a link](http://example.com)

To create not a block, but an inline code span, use backticks:

Here is some inline `code`.

For more help see http://daringfireball.net/projects/markdown/syntax