intelligence_system/collectors/__pycache__/rss_subscriptions.cpython-312.pyc
91 lines
17 KiB
Ë
ÂùhX4ã óØddlZddlZddlmZddlZddlZddlZddlZddlZddl m
Z
m Z ddl m
Z
ddlmZmZmZmZej&j)ej&j+e««Zej&j)e«Zeej&vrej&j3de«ddlmZddd d
d d d
ddddœ
ZdZGdd«Zedk(rejA«yy)éN)Údatetime)ÚThreadPoolExecutorÚ as_completed)Úlogger)ÚDictÚListÚOptionalÚAny)Ú
MySQLAgentÚ localhostÚrootÚ123123Úintelligence_systemiê Úutf8mb4é
éT)
ÚhostÚuserÚpasswordÚdatabaseÚportÚcharsetÚconnect_timeoutÚ read_timeoutÚ
write_timeoutÚ
autocommitÚcollector_rss_subscriptionsc ó‚eZdZdZdZddededeede eeffdZ
defd „Z dee fd
Z
d e ddfd Zdd
ededeej"fdZddeedede eej"ffdZde eefd
ede eeffdZ ddej"d ee d
eedee fdZdej0de eeffdZed«Zy)Ú
NewsAPIClientuHæ–°é—»API客户端,用于获å–和处ç†RSSæºæ•°æ®å¹¶å†™å…¥æ•°æ®åº“có˜tjd¬«|_tt«|_|jj d«y)u*åˆå§‹åŒ–客户端并建立数æ®åº“连接r)Úmoduleu9æ–°é—»API客户端åˆå§‹åŒ–完æˆï¼Œå·²è¿žæŽ¥åˆ°æ•°æ®åº“N)rÚbindr Úlocal_DB_ConfigÚdb_agentÚinfo)Úselfs úCd:\Idea Project\intelligence_system\collectors\rss_subscriptions.pyÚ__init__zNewsAPIClient.__init__*s1ä—kÔŒ Ü"¤?Ó3ˆŒ
Ø ×ÑÐsuccessÚmessageÚdataÚreturncó2t|«t|«|dœS)u统一返回结果格å¼)r*r+r,)ÚboolÚstr)r&r*r+r,s r'Ú_format_resultzNewsAPIClient._format_result0sô˜G“}ܘ7“|Øñ
ð
r)cóV |jjdtdd¬«}|s$|jj dtd«y|jjdtd¬«}|Dcgc]}|d Œ }}gd
¢}|Dcgc] }||vsŒ|Œ }}|r&|jj dtd |«y|jj d |«ycc}wcc}w#t $r3}|jj d
t|«d¬«Yd}~yd}~wwxYw)uQéªŒè¯æ•°æ®åº“表结构是å¦ç¬¦åˆè¦æ±‚(适é…元组格å¼çš„æŸ¥è¯¢ç»“果)zSHOW TABLES LIKE 'Ú'T)Úfetchu表 u" ä¸å­˜åœ¨ï¼Œè¯·å…ˆåˆ›å»ºè¡¨ç»“æž„Fz DESCRIBE r©õ 文章标题u 文章链接u 文章摘è¦õ å‘布时间u æ¥æºURLu 创建时间u æ›´æ–°æ—¶é—´u 缺少必è¦å­—段:u0æ•°æ®åº“表结构验è¯é€šè¿‡ï¼Œå½“å‰å­—段:uæ•°æ®åº“验è¯å¤±è´¥: ©Úexc_infoN)r$Ú execute_sqlÚ
table_namerÚerrorr%Ú Exceptionr0)r&ÚresultÚ desc_resultÚcolÚcolumnsÚrequired_columnsÚ missing_colsÚes r'Úverify_databasezNewsAPIClient.verify_database8sBð à—]‘]×$¤Z L°ÐðˆFñ
Ø ×! ¨ Ð4VÐ"WÔðŸ-™-לJ˜<Ððˆ
*5Ó #s˜1“v¨ˆ MÐ á+;ÓRÑ+; C¸sÈ'Ò?QšCÐ+;ˆØ ×! ¨ Ð4JÈ<È.Ð"YÔà K‰K× Ñ ÐOÐPWÈyÐ ùò6ùòSøôò Ø K‰K× Ñ Ð 7¼¸A»°xÐ@È4Ð Ô ûð úsH‚A
C,Á
(C,Á5 C"Â
C, C'ÂC'Â)C,ÃC,Ã"
C,Ã, D(Ã5)D#Ä#D(có,tjjtj«dd«}tjj |«r[ t |d«5}t
j|«}|jjd|jd««|cddd«S|jjd
«y#1swYnxYwŒ)#t$r3}|jjdt|«d¬ «Yd}~Œ`d}~wwxYw) u加载上次更新时间缓存Úoutputúlast_update.pklÚrbu加载上次更新时间: ú%Y-%m-%d %H:%M:%SNu 加载上次更新时间失败: Tr8u9未找到上次更新时间缓存,将获å–全部数æ®)ÚosÚpathÚjoinÚgetcwdÚexistsÚopenÚpickleÚloadrÚdebugÚstrftimer=r<r0)r&Ú
cache_fileÚ last_updaterDs r'Úload_last_update_timez#NewsAPIClient.load_last_update_time[ä—WW—\\¤"§)¡)£+¨xÐ9JÓKˆ
Ü
7‰7>‰>˜ 
^ܘ* +¨qÜ"(§+¡+¨a£.—KK×%Ð(BÀ;×CWÑCWÐXkÓClÐBmÐ&nÔ
×ÑÐ÷+úÐ+øôò
^Ø ×!Ð$DÄSÈÃVÀHÐ"MÐX\Ð!×]ûð
^ús1Á CÁ AC
Â$ CÃ
CÃCà Dà )DÄDrWcó tjjtj«d«}tj|d¬«tjj|d«}t |d«5}t
j||«ddd«|jjd|jd««y#1swYŒ7xYw#t$r3}|jjd t|«d¬
«Yd}~yd}~wwxYw) uä¿å­˜æœ¬æ¬¡æ›´æ–°æ—¶é—´rGT)Úexist_okrHÚwbNuå·²ä¿å­˜æœ¬æ¬¡æ›´æ–°æ—¶é—´: rJuä¿å­˜æ›´æ–°æ—¶é—´å¤±è´¥: r8)rKrLrMrNÚmakedirsrPrQÚdumprrSrTr=r<r0)r&rWÚ cache_dirrUrVrDs r'Úsave_last_update_timez#NewsAPIClient.save_last_update_timeið TÜŸŸ ¤R§Y¡Y£[°(Ó;ˆIÜ K‰K˜ ¨DÕ ŸŸ iÐ1BÓCˆj '¨1Ü ˜Ô K‰K× Ñ Ð =¸k×>RÑ>RÐSfÓ>gÐ=hÐ 'ûôò TØ K‰K× Ñ Ð :¼3¸q»6¸(ÐCÈdÐ × Sûð Tús0A5CÁ7CÂ5CÃC
à Cà D Ã)DÄD ÚurlÚtimeoutc ó°ddi}td«D]´} tj|||¬«}|j«|j|_t
j|j«}|jr+|jjd|d|j«|jjd|d«|cS|jj%d|d«y#tj$r[}|jjd |d
zd |d t|««|d
krt!j"d|d
zz«Yd}~Œ@d}~wwxYw)u获å–å¹¶è§£æžå•个RSSæºz
User-AgentzsMozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36é)Úheadersrauè§£æž u 存在潜在问题: u
æˆåŠŸèŽ·å u
çš„RSSæ•°æ®u第 éu æ¬¡èŽ·å– u 失败: éNu三次å°è¯•åŽä»æ— æ³•èŽ·å– )ÚrangeÚrequestsÚgetÚraise_for_statusÚapparent_encodingÚencodingÚ
feedparserÚparseÚtextÚbozorÚwarningÚbozo_exceptionrSÚRequestExceptionr0ÚtimeÚsleepr<)r&r`rardÚattemptÚresponseÚfeedrDs r'Úfetch_single_rsszNewsAPIClient.fetch_single_rssvs@ð
ðPð
ˆô˜Qxˆ
Ü#Ÿ<™<¨°WÀgÔNØ×+Ø$,×$>Ñ$>Ô!ׯ
©
Ó6à—9—KK×'¨'°#°Ð6KÈD×L_ÑL_ÐK`Ð(aÔ ×! M°#°°mÐ"DÔ ð ð&
×ÑÐ;¸C¸
Ðøô×
Ø ×# d¨7°Q©;¨-°{À3À%ÀyÔQTÐUVÓQWÐPXÐ$YÔ˜Q—J‘J˜q G¨a¡KÑûð 
ús”B/C'Ã'EÃ:AEÅEÚurlsc óôi}td¬«5}|Dcic] }|j|j||«|Œ"}}t|«D]}||} |j «}|r|||<Œ! ddd«|j jdt|«d t|«d
«|Scc}w#t
$r6} |j jd|dt| «d¬«Yd} ~ Œ d} ~ wwxYw#1swYŒ†xYw) uå¹¶å‘获å–多个RSSæºrc)Ú max_workersuå¤„ç† u æ—¶å‘生异常: Tr8Nu"RSSæºèŽ·å–完æˆï¼ŒæˆåŠŸèŽ·å– Ú/u 个æº) rÚsubmitryrr>r=rr<r0r%Úlen)
r&rzraÚfeedsÚexecutorr`Ú
future_to_urlÚfuturerxrDs
r'Ú
fetch_all_rsszNewsAPIClient.fetch_all_rsssÿàˆÜ
¨AÕ
.°(ÙbfÓgÑbfÐ[^˜XŸ_™_¨T×-BÑ-BÀCÈÓQÐSVÑVÐbfˆ& 5Ø# +ð`Ø!Ÿ=™=?Ø%)˜˜c™
øñ 
×ÑÐ=¼cÀ%»j¸\ÈÌ3ÈtË9È+ÐU\Ј ùòhøô`Ø—K‘K×°¨uÐ4FÄsÈ1ÃvÀhÐ&OÐZ^Ð%×_ûð`ú÷
.úsEC.”%B'¹C.ÁB,Á&C.Â'C.Â, C+Â5,C&Ã!C.Ã&C+Ã+C.Ã.C7Úentrycó@|jdd«xsd}t|«dkDr|dddz}|jdd«xsd}t|«d kDr|dd
dz}|jd d «}|jd
g«}|rt|dd«r|djnd}|d k7r|n |r|dddznd }|jd«xs|jd«} | r t | ddŽ}
nI|jd|jdd««} t j
| d«j
d¬«}
|xsd} t| «d kDr| dd
dz} t j«jd«}
||||
jd«| |
|
dœS#t j«}
YŒpxYw)u6处ç†å•个RSSæ¡ç›®ï¼Œè½¬æ¢ä¸ºæ•°æ®åº“兼容格å¼Útitleu 无标题éÿNéüz...Úlinku 无链接iÚsummaryu无内容摘è¦ÚcontentrÚvalueÚéÈÚpublished_parsedÚupdated_parsedéÚ publishedÚupdatedz%a, %d %b %Y %H:%M:%S %z)Útzinfou æœªçŸ¥æ¥æºrJr5) rirÚhasattrrrÚstrptimeÚreplaceÚnowrT)r&r…r`r‡rÚ content_listrŒÚ descriptionrÚ
entry_timeÚpub_strÚ
source_urlÚ current_times r'Úprocess_feed_entryz NewsAPIClient.process_feed_entry¤ð ˜' >°;ˆÜ ˆu‹:˜Ò ؘ$˜3K 'ˆy‰y˜ ÓˆÜ ˆt‹9 ؘ˜; Ñ&ˆ—))˜IÐ'8Ó9ˆØ—yy ¨BÓ/ˆ Ù,8¼WÀ\ÐRSÁ_ÐV]Ô=^,˜q/×'ÐegˆØ!(Ð,=Ò!=gÑ]dÀGÈDÈSÀMÐTYÒDYÐj{ˆ ð!Ÿ9™9Ð%7ÓW¸E¿I¹IÐFVÓ<WÐÙ Ü!Ð#3°B°QÐ#7Ð8‰Jà—ii  ¨U¯Y©Y°yÀ"Ó-EÓFˆ
%×.¨wÐ8RÓS×[ÐcgÐh
ð