diff --git a/Changelog.md b/Changelog.md index 49a8afb..be8dd2b 100644 --- a/Changelog.md +++ b/Changelog.md @@ -1,3 +1,35 @@ +# 2025.2.25.0 + +*2025-02-25* + +- Added + - Sites: + - **Bluesky** + - Facebook: **`Reels` downloads** + - OnlyFans: default value for `App-Token` + - Pinterest: **sub-boards downloading** + - Threads: ability to manually change `UserName` + - Twitter: + - new icon support + - **sleep timers to fully download large profiles** + - Feed: + - ability to invert selection + - open post URL when double-clicking on subscription image + - Minor improvements +- Updated + - yt-dlp up to version **2025.02.19** + - gallery-dl up to version **1.28.5** +- PluginProvider + - `IPluginContentProvider`: added property `NameTrue` +- Fixed + - Sites: + - Facebook: videos are not downloading + - LPSG: simplified 403 error + - PornHub: photos & videos are not downloading + - Reddit: **token does not update automatically** + - Threads: **data is not downloading** + - Minor bugs + # 2025.1.12.0 *2025-01-12* diff --git a/FAQ.md b/FAQ.md index 601c5db..245848b 100644 --- a/FAQ.md +++ b/FAQ.md @@ -63,6 +63,7 @@ I strongly recommend you to **regularly** create backup copies of the settings f - Reddit: don't use credentials at all or configure [OAuth](https://github.com/AAndyProgram/SCrawler/wiki/Settings#how-to-get-reddit-credentials). **Reddit profiles can be downloaded without any credentials at all. Subreddits require OAuth! If nothing downloads, use OAuth!** Don't use OAuth token to download saved posts (use cookies only). - **META** (**Instagram**, Threads, Facebook): you need **cookies** and fill in **all fields** +- **Instagram [TIPS](https://github.com/AAndyProgram/SCrawler/wiki/Settings#instagram-tips)** - **Instagram saved posts**: I don't consider questions like "I have 10k saved posts and only 1000 were downloaded". Download posts, remove them from saved posts, delete the `Saved posts` **settings folder**, repeat. - TikTok: works via yt-dlp. If something doesn't download, we need to wait until yt-dlp fixes it. TikTok doesn't require cookies to download. - Porn sites: **COOKIES**! diff --git a/ProgramScreenshots/SettingsGlobalBasis.png b/ProgramScreenshots/SettingsGlobalBasis.png index 5649281..223d447 100644 Binary files a/ProgramScreenshots/SettingsGlobalBasis.png and b/ProgramScreenshots/SettingsGlobalBasis.png differ diff --git a/ProgramScreenshots/SettingsGlobalBehavior.png b/ProgramScreenshots/SettingsGlobalBehavior.png index 51ad23c..1eb84e0 100644 Binary files a/ProgramScreenshots/SettingsGlobalBehavior.png and b/ProgramScreenshots/SettingsGlobalBehavior.png differ diff --git a/ProgramScreenshots/SettingsGlobalDefaults.png b/ProgramScreenshots/SettingsGlobalDefaults.png index 9e8ca07..de5b581 100644 Binary files a/ProgramScreenshots/SettingsGlobalDefaults.png and b/ProgramScreenshots/SettingsGlobalDefaults.png differ diff --git a/ProgramScreenshots/SettingsSiteBluesky.png b/ProgramScreenshots/SettingsSiteBluesky.png new file mode 100644 index 0000000..283ac2f Binary files /dev/null and b/ProgramScreenshots/SettingsSiteBluesky.png differ diff --git a/ProgramScreenshots/SettingsSiteFacebook.png b/ProgramScreenshots/SettingsSiteFacebook.png index a505f92..af80f1c 100644 Binary files a/ProgramScreenshots/SettingsSiteFacebook.png and b/ProgramScreenshots/SettingsSiteFacebook.png differ diff --git a/ProgramScreenshots/SettingsSiteOnlyFans.png b/ProgramScreenshots/SettingsSiteOnlyFans.png index f0514a2..e6fb3b5 100644 Binary files a/ProgramScreenshots/SettingsSiteOnlyFans.png and b/ProgramScreenshots/SettingsSiteOnlyFans.png differ diff --git a/ProgramScreenshots/SettingsSitePornHub.png b/ProgramScreenshots/SettingsSitePornHub.png index 53d52f4..839d4c5 100644 Binary files a/ProgramScreenshots/SettingsSitePornHub.png and b/ProgramScreenshots/SettingsSitePornHub.png differ diff --git a/ProgramScreenshots/SettingsSiteTwitter.png b/ProgramScreenshots/SettingsSiteTwitter.png index 51aafd2..2a62595 100644 Binary files a/ProgramScreenshots/SettingsSiteTwitter.png and b/ProgramScreenshots/SettingsSiteTwitter.png differ diff --git a/ProgramScreenshots/SettingsSiteYouTube.png b/ProgramScreenshots/SettingsSiteYouTube.png index 0ffb850..6e2f1a0 100644 Binary files a/ProgramScreenshots/SettingsSiteYouTube.png and b/ProgramScreenshots/SettingsSiteYouTube.png differ diff --git a/ProgramScreenshots/SettingsTwitterUser.png b/ProgramScreenshots/SettingsTwitterUser.png index 5389e0a..d6c770d 100644 Binary files a/ProgramScreenshots/SettingsTwitterUser.png and b/ProgramScreenshots/SettingsTwitterUser.png differ diff --git a/README.md b/README.md index 662ddac..8ca99a0 100644 --- a/README.md +++ b/README.md @@ -38,6 +38,7 @@ A program to download photo and video from [any site](#supported-sites) (e.g. Yo - Reddit images, galleries of images, videos, saved posts; - Redgifs images and videos (https://www.redgifs.com/); - Twitter images and videos, saved (bookmarked) posts, likes, communities; + - Bluesky images and videos; - OnlyFans images and videos, saved (bookmarked) posts, stories; - JustForFans images and videos, saved (bookmarked) posts; - Mastodon images and videos, saved (bookmarked) posts; @@ -78,6 +79,7 @@ A program to download photo and video from [any site](#supported-sites) (e.g. Yo - **YouTube Music** - **Reddit** - **Twitter** +- **Bluesky** - **OnlyFans** *(partial support)*[^1] - **Instagram** - **Threads** @@ -131,6 +133,7 @@ First, the program downloads the full profile. After the program downloads only - **[SITES REQUIREMENTS](https://github.com/AAndyProgram/SCrawler/wiki/Settings#sites-requirements)** - [Reddit](https://github.com/AAndyProgram/SCrawler/wiki/Settings#reddit) - [Twitter](https://github.com/AAndyProgram/SCrawler/wiki/Settings#twitter) + - [Bluesky](https://github.com/AAndyProgram/SCrawler/wiki/Settings#bluesky) - [OnlyFans](https://github.com/AAndyProgram/SCrawler/wiki/Settings#onlyfans) - [Mastodon](https://github.com/AAndyProgram/SCrawler/wiki/Settings#mastodon) - [Instagram](https://github.com/AAndyProgram/SCrawler/wiki/Settings#instagram) @@ -215,16 +218,4 @@ F5-->[*] Discord server: https://discord.gg/uFNUXvFFmg - - [^1]: Partial support means that I don't have personal accounts on paid porn sites because I don't pay for porn. If this site has stopped downloading and you want me to fix it, please be ready to give me access to an account with at least one active subscription. Otherwise, the download from this site will not be fixed. \ No newline at end of file diff --git a/SCrawler.PluginProvider/Interfaces/IPluginContentProvider.vb b/SCrawler.PluginProvider/Interfaces/IPluginContentProvider.vb index fc82097..6ca9a33 100644 --- a/SCrawler.PluginProvider/Interfaces/IPluginContentProvider.vb +++ b/SCrawler.PluginProvider/Interfaces/IPluginContentProvider.vb @@ -17,6 +17,7 @@ Namespace Plugin Property Settings As ISiteSettings Property AccountName As String Property Name As String + Property NameTrue As String Property ID As String Property Options As String Property ParseUserMediaOnly As Boolean diff --git a/SCrawler.PluginProvider/My Project/AssemblyInfo.vb b/SCrawler.PluginProvider/My Project/AssemblyInfo.vb index b98d3e2..a65597b 100644 --- a/SCrawler.PluginProvider/My Project/AssemblyInfo.vb +++ b/SCrawler.PluginProvider/My Project/AssemblyInfo.vb @@ -13,7 +13,7 @@ Imports System.Runtime.InteropServices - + @@ -32,6 +32,6 @@ Imports System.Runtime.InteropServices ' by using the '*' as shown below: ' - - + + diff --git a/SCrawler.YouTube/Base/YouTubeSettings.vb b/SCrawler.YouTube/Base/YouTubeSettings.vb index ac676d5..d1fec59 100644 --- a/SCrawler.YouTube/Base/YouTubeSettings.vb +++ b/SCrawler.YouTube/Base/YouTubeSettings.vb @@ -168,9 +168,6 @@ Namespace API.YouTube.Base Public ReadOnly Property CreateDescriptionFiles As XMLValue(Of Boolean) - - Public ReadOnly Property CreateDescriptionFiles_AddUploadDate As XMLValue(Of Boolean) Public ReadOnly Property CreateDescriptionFiles_CreateWithNoDescription As XMLValue(Of Boolean) diff --git a/SCrawler.YouTube/Controls/PlayListParserForm.Designer.vb b/SCrawler.YouTube/Controls/PlayListParserForm.Designer.vb index 1d6483e..a5c4e9a 100644 --- a/SCrawler.YouTube/Controls/PlayListParserForm.Designer.vb +++ b/SCrawler.YouTube/Controls/PlayListParserForm.Designer.vb @@ -121,9 +121,10 @@ Namespace API.YouTube.Controls Me.TXT_LIMIT.Location = New System.Drawing.Point(3, 3) Me.TXT_LIMIT.Name = "TXT_LIMIT" Me.TXT_LIMIT.PlaceholderEnabled = True - Me.TXT_LIMIT.PlaceholderText = "e.g. ABCDE" + Me.TXT_LIMIT.PlaceholderText = "e.g. RDAMP" Me.TXT_LIMIT.Size = New System.Drawing.Size(378, 22) Me.TXT_LIMIT.TabIndex = 2 + Me.TXT_LIMIT.Text = "RDAMP" ' 'CONTAINER_MAIN ' diff --git a/SCrawler.YouTube/My Project/AssemblyInfo.vb b/SCrawler.YouTube/My Project/AssemblyInfo.vb index b9f6692..ce03b2e 100644 --- a/SCrawler.YouTube/My Project/AssemblyInfo.vb +++ b/SCrawler.YouTube/My Project/AssemblyInfo.vb @@ -32,6 +32,6 @@ Imports System.Runtime.InteropServices ' by using the '*' as shown below: ' - - + + diff --git a/SCrawler.YouTube/Objects/YouTubeMediaContainerBase.vb b/SCrawler.YouTube/Objects/YouTubeMediaContainerBase.vb index f4a4b8f..692f921 100644 --- a/SCrawler.YouTube/Objects/YouTubeMediaContainerBase.vb +++ b/SCrawler.YouTube/Objects/YouTubeMediaContainerBase.vb @@ -1215,12 +1215,12 @@ Namespace API.YouTube.Objects End If With MyYouTubeSettings - If .CreateDescriptionFiles And (Not Description.IsEmptyString Or - (.CreateDescriptionFiles_CreateWithNoDescription And .CreateDescriptionFiles_AddUploadDate)) Then + If .CreateDescriptionFiles And (Not Description.IsEmptyString Or .CreateDescriptionFiles_CreateWithNoDescription) Then Dim fileDesr As SFile = File fileDesr.Extension = "txt" Using fileDesrText As New TextSaver(fileDesr) - If .CreateDescriptionFiles_AddUploadDate Then fileDesrText.Append($"Uploaded: {DateAdded:yyyy-MM-dd HH:mm:ss}") + fileDesrText.Append($"Uploaded: {DateAdded:yyyy-MM-dd HH:mm:ss}") + fileDesrText.AppendLine() fileDesrText.AppendLine($"URL: {URL}") fileDesrText.AppendLine($"Channel name: {AccountName}") fileDesrText.AppendLine($"Channel ID: {UserID}") diff --git a/SCrawler.YouTubeDownloader/My Project/AssemblyInfo.vb b/SCrawler.YouTubeDownloader/My Project/AssemblyInfo.vb index bbc8c74..0e9e1bc 100644 --- a/SCrawler.YouTubeDownloader/My Project/AssemblyInfo.vb +++ b/SCrawler.YouTubeDownloader/My Project/AssemblyInfo.vb @@ -32,6 +32,6 @@ Imports System.Runtime.InteropServices ' by using the '*' as shown below: ' - - + + diff --git a/SCrawler/API/Base/EditorExchangeOptionsBase.vb b/SCrawler/API/Base/EditorExchangeOptionsBase.vb new file mode 100644 index 0000000..19cc411 --- /dev/null +++ b/SCrawler/API/Base/EditorExchangeOptionsBase.vb @@ -0,0 +1,22 @@ +' Copyright (C) Andy https://github.com/AAndyProgram +' This program is free software: you can redistribute it and/or modify +' it under the terms of the GNU General Public License as published by +' the Free Software Foundation, either version 3 of the License, or +' (at your option) any later version. +' +' This program is distributed in the hope that it will be useful, +' but WITHOUT ANY WARRANTY +Imports SCrawler.Plugin.Attributes +Imports DN = SCrawler.API.Base.DeclaredNames +Namespace API.Base + Friend Class EditorExchangeOptionsBase + Friend Overridable Property SiteKey As String + + Friend Overridable Property UserName As String = String.Empty + Friend Sub New(ByVal u As UserDataBase) + UserName = u.NameTrue(True) + End Sub + Friend Sub New() + End Sub + End Class +End Namespace \ No newline at end of file diff --git a/SCrawler/API/Base/GDL.vb b/SCrawler/API/Base/GDL.vb index 199f17c..0fbf7eb 100644 --- a/SCrawler/API/Base/GDL.vb +++ b/SCrawler/API/Base/GDL.vb @@ -6,66 +6,7 @@ ' ' This program is distributed in the hope that it will be useful, ' but WITHOUT ANY WARRANTY -Imports PersonalUtilities.Tools -Imports PersonalUtilities.Functions.RegularExpressions Namespace API.Base.GDL - Friend Module Declarations - Private Structure GDLURL : Implements IRegExCreator - Private _URL As String - Friend ReadOnly Property URL As String - Get - Return _URL - End Get - End Property - Public Shared Widening Operator CType(ByVal u As String) As GDLURL - Return New GDLURL With {._URL = u} - End Operator - Public Shared Widening Operator CType(ByVal u As GDLURL) As String - Return u.URL - End Operator - Private Function CreateFromArray(ByVal ParamsArray() As String) As Object Implements IRegExCreator.CreateFromArray - If ParamsArray.ListExists(2) Then - Dim u$ = ParamsArray(0).StringTrim.StringTrimEnd("/"), u2$ - If Not u.IsEmptyString Then - u2 = ParamsArray(1).StringTrim - If Not u2.IsEmptyString AndAlso u2.StartsWith("GET", StringComparison.OrdinalIgnoreCase) Then - u2 = u2.Remove(0, 3).StringTrim.StringTrimStart("/") - If Not u2.IsEmptyString Then _URL = $"{u}/{u2}" - End If - End If - End If - Return Me - End Function - Public Shared Operator =(ByVal x As GDLURL, ByVal y As GDLURL) As Boolean - Return x.URL = y.URL - End Operator - Public Shared Operator <>(ByVal x As GDLURL, ByVal y As GDLURL) As Boolean - Return Not x.URL = y.URL - End Operator - Public Overrides Function ToString() As String - Return URL - End Function - Public Overrides Function Equals(ByVal Obj As Object) As Boolean - Return URL = CType(Obj, String) - End Function - End Structure - Private ReadOnly Property GdlUrlPattern As RParams = RParams.DM(GDLBatch.UrlLibStart.Replace("[", "\[").Replace("]", "\]") & - "([^""]+?)""(GET [^""]+)""", 0, EDP.ReturnValue) - Friend Function GetUrlsFromGalleryDl(ByVal Batch As BatchExecutor, ByVal Command As String) As List(Of String) - Dim urls As New List(Of String) - Dim u As GDLURL - With Batch - .Execute(Command) - If .ErrorOutputData.Count > 0 Then - For Each eValue$ In .ErrorOutputData - u = RegexFields(Of GDLURL)(eValue, {GdlUrlPattern}, {1, 2}, EDP.ReturnValue).ListIfNothing.FirstOrDefault - If Not u.URL.IsEmptyString Then urls.ListAddValue(u, LNC) - Next - End If - End With - Return urls - End Function - End Module Friend Class GDLBatch : Inherits TokenBatch Friend Const UrlLibStart As String = "[urllib3.connectionpool][debug]" Friend Const UrlTextStart As String = UrlLibStart & " https" diff --git a/SCrawler/API/Base/IUserData.vb b/SCrawler/API/Base/IUserData.vb index 758571b..e72e224 100644 --- a/SCrawler/API/Base/IUserData.vb +++ b/SCrawler/API/Base/IUserData.vb @@ -18,6 +18,7 @@ Namespace API.Base End Enum ReadOnly Property Site As String ReadOnly Property Name As String + Property NameTrue As String Property ID As String Property Options As String Property FriendlyName As String diff --git a/SCrawler/API/Base/SiteSettingsBase.vb b/SCrawler/API/Base/SiteSettingsBase.vb index c1448b1..91f236d 100644 --- a/SCrawler/API/Base/SiteSettingsBase.vb +++ b/SCrawler/API/Base/SiteSettingsBase.vb @@ -33,7 +33,7 @@ Namespace API.Base End Property Friend Property AccountName As String Implements ISiteSettings.AccountName Friend Property Temporary As Boolean = False Implements ISiteSettings.Temporary - Friend Property DefaultInstance As ISiteSettings = Nothing Implements ISiteSettings.DefaultInstance + Friend Overridable Property DefaultInstance As ISiteSettings = Nothing Implements ISiteSettings.DefaultInstance Protected _UserAgentDefault As String = String.Empty Friend Overridable Property UserAgentDefault As String Implements ISiteSettings.UserAgentDefault Get @@ -55,6 +55,11 @@ Namespace API.Base Friend Overridable ReadOnly Property Responser As Responser Private _UserOptionsExists As Boolean = False Private _UserOptionsType As Type = Nothing + Protected Overridable Function UserOptionsValid(ByVal Options As Object) As Boolean + Return True + End Function + Protected Overridable Sub UserOptionsSetParameters(ByRef Options As Object) + End Sub Protected Property UserOptionsType As Type Get Return _UserOptionsType @@ -243,7 +248,7 @@ Namespace API.Base #Region "User info" Protected UrlPatternUser As String = String.Empty Friend Overridable Function GetUserUrl(ByVal User As IPluginContentProvider) As String Implements ISiteSettings.GetUserUrl - If Not UrlPatternUser.IsEmptyString Then Return String.Format(UrlPatternUser, User.Name) + If Not UrlPatternUser.IsEmptyString Then Return String.Format(UrlPatternUser, User.NameTrue.IfNullOrEmpty(User.Name)) Return String.Empty End Function Private Function ISiteSettings_GetUserPostUrl(ByVal User As IPluginContentProvider, ByVal Media As IUserMedia) As String Implements ISiteSettings.GetUserPostUrl @@ -380,11 +385,40 @@ Namespace API.Base End Sub Friend Overridable Sub UserOptions(ByRef Options As Object, ByVal OpenForm As Boolean) Implements ISiteSettings.UserOptions If _UserOptionsExists Then - If Options Is Nothing OrElse Not Options.GetType Is _UserOptionsType Then - Options = AConvert(Me, AModes.Var, _UserOptionsType,, True, Nothing) + If Options Is Nothing OrElse (Not Options.GetType Is _UserOptionsType OrElse Not UserOptionsValid(Options)) Then + Dim args% = 0 + Dim constructor As ConstructorInfo = Nothing + With _UserOptionsType.GetTypeInfo.DeclaredConstructors + If .ListExists Then + With .Where(Function(ByVal c As ConstructorInfo) As Boolean + With c.GetParameters + If .ListExists Then + If .Count = 1 Then + Return .Self()(0).ParameterType.GetInterfaces.ListIfNothing.Where(Function(i) i Is Me.GetType).Count = 1 + Else + Return False + End If + Else + Return True + End If + End With + Return If(c.GetParameters?.Count, 0).ValueBetween(0, 1) + End Function) + If .ListExists Then + args = .Max(Of Integer)(Function(c) If(c.GetParameters?.Count, 0)) + constructor = .First(Function(c) If(c.GetParameters?.Count, 0) = args) + End If + End With + End If + End With + If Not constructor Is Nothing Then + If args > 0 AndAlso Not constructor.GetParameters()(0).ParameterType Is GetType(ISiteSettings) Then Throw New Exception + If args = 0 Then Options = constructor.Invoke(Nothing) Else Options = constructor.Invoke({Me}) + End If If Options Is Nothing Then Options = Activator.CreateInstance(_UserOptionsType) + If Not Options Is Nothing Then UserOptionsSetParameters(Options) End If - If OpenForm Then + If Not Options Is Nothing And OpenForm Then Using f As New InternalSettingsForm(Options, Me, False) : f.ShowDialog() : End Using End If Else diff --git a/SCrawler/API/Base/UserDataBase.vb b/SCrawler/API/Base/UserDataBase.vb index f2545ba..b6f3ab3 100644 --- a/SCrawler/API/Base/UserDataBase.vb +++ b/SCrawler/API/Base/UserDataBase.vb @@ -178,6 +178,8 @@ Namespace API.Base #Region "Additional names" Protected Const Name_SiteMode As String = "SiteMode" Protected Const Name_TrueName As String = "TrueName" + 'TODELETE Name_TrueName2 + Protected Const Name_TrueName2 As String = "NameTrue" Protected Const Name_Arguments As String = "Arguments" #End Region #End Region @@ -278,6 +280,21 @@ Namespace API.Base Return User.Name End Get End Property + Private _NameTrue As String = String.Empty + Friend Overridable Overloads Property NameTrue As String Implements IUserData.NameTrue, IPluginContentProvider.NameTrue + Get + Return NameTrue(False) + End Get + Set(ByVal NewName As String) + If Not _NameTrue = NewName Then EnvirChanged(NewName) + _NameTrue = NewName + End Set + End Property + Friend Overloads ReadOnly Property NameTrue(ByVal Exact As Boolean) As String + Get + Return If(Exact, _NameTrue, _NameTrue.IfNullOrEmpty(Name)) + End Get + End Property Friend Overridable Property ID As String = String.Empty Implements IUserData.ID, IPluginContentProvider.ID Protected _FriendlyName As String = String.Empty Friend Overridable Property FriendlyName As String Implements IUserData.FriendlyName @@ -348,12 +365,20 @@ Namespace API.Base Protected Function UserDescriptionNeedToUpdate() As Boolean Return (UserDescription.IsEmptyString Or _DescriptionEveryTime) And Not _DescriptionChecked End Function - Protected Sub UserDescriptionUpdate(ByVal Descr As String) - If UserDescriptionNeedToUpdate() Then + Protected Sub UserDescriptionUpdate(ByVal Descr As String, Optional ByVal Force As Boolean = False, + Optional ByVal InsertFirst As Boolean = False, Optional ByVal AppendDate As Boolean = False) + If UserDescriptionNeedToUpdate() Or Force Then + If AppendDate Then Descr = $"{Now.ToStringDateDef}: {Descr}" If UserDescription.IsEmptyString Then UserDescription = Descr + _ForceSaveUserInfo = True ElseIf Not UserDescription.Contains(Descr) Then - UserDescription &= $"{vbNewLine}----{vbNewLine}{Descr}" + If InsertFirst Then + UserDescription = $"{Descr}{vbNewLine}{UserDescription}" + Else + UserDescription &= $"{vbNewLine}----{vbNewLine}{Descr}" + End If + _ForceSaveUserInfo = True End If _DescriptionChecked = True End If @@ -907,6 +932,10 @@ BlockNullPicture: FileExists = True Using x As New XmlFile(MyFileSettings) With {.XmlReadOnly = True} If User.Name.IsEmptyString Then User.Name = x.Value(Name_UserName) + _NameTrue = x.Value(Name_TrueName) +#Disable Warning BC40008 + If _NameTrue.IsEmptyString AndAlso x.Contains(Name_TrueName2) Then _NameTrue = x.Value(Name_TrueName2) +#Enable Warning UserExists = x.Value(Name_UserExists).FromXML(Of Boolean)(True) UserSuspended = x.Value(Name_UserSuspended).FromXML(Of Boolean)(False) ID = x.Value(Name_UserID) @@ -967,6 +996,7 @@ BlockNullPicture: x.Add(Name_Plugin, HOST.Key) x.Add(Name_AccountName, AccountName) x.Add(Name_UserName, User.Name) + x.Add(Name_TrueName, _NameTrue) x.Add(Name_Model_User, CInt(UserModel)) x.Add(Name_Model_Collection, CInt(CollectionModel)) x.Add(Name_SpecialPath, User.SpecialPath) @@ -1162,6 +1192,7 @@ BlockNullPicture: Select Case Caller Case NameOf(UserExists) : If Not _EnvirUserExists = CBool(NewValue) Then _EnvirChanged = True : _EnvirInvokeUserUpdated = True Case NameOf(UserSuspended) : If Not _EnvirUserSuspended = CBool(NewValue) Then _EnvirChanged = True : _EnvirInvokeUserUpdated = True + Case NameOf(NameTrue) : _EnvirChanged = True : _EnvirInvokeUserUpdated = True : _ForceSaveUserInfo = True : _ForceSaveUserInfoOnException = True Case Else : _EnvirChanged = True End Select End If @@ -1282,9 +1313,9 @@ BlockNullPicture: UpdateUserInformation_Ex() If Not exit_ex.Silent Then If exit_ex.SimpleLogLine Then - MyMainLOG = $"{ToStringForLog()}: downloading interrupted (exit) ({exit_ex.Message})" + LogError(Nothing, $"downloading interrupted (exit) ({exit_ex.Message})") Else - ErrorsDescriber.Execute(EDP.SendToLog, exit_ex, $"{ToStringForLog()}: downloading interrupted (exit)") + LogError(exit_ex, "downloading interrupted (exit)") End If End If If _EnvirInvokeUserUpdated Then OnUserUpdated() @@ -1846,6 +1877,31 @@ BlockNullPicture: Protected Overridable Function CreateFileFromUrl(ByVal URL As String) As SFile Return New SFile(URL) End Function + Protected Overridable Function SimpleDownloadAvatar(ByVal ImageAddress As String, Optional ByVal FileCreateFunc As Func(Of String, SFile) = Nothing, + Optional ByVal e As ErrorsDescriber = Nothing) As SFile + Try + If Not ImageAddress.IsEmptyString Then + Dim f As SFile + If FileCreateFunc Is Nothing Then + f = CreateFileFromUrl(ImageAddress) + Else + f = FileCreateFunc.Invoke(ImageAddress) + End If + If Not f.Name.IsEmptyString Then f.Name = f.Name.StringRemoveWinForbiddenSymbols.StringTrim + If Not f.Name.IsEmptyString Then + f.Path = DownloadContentDefault_GetRootDir() + f.Separator = "\" + If f.Extension.IsEmptyString Then f.Extension = "jpg" + If Not f.Exists Then GetWebFile(ImageAddress, f, EDP.ReturnValue) + If f.Exists Then IconBannerDownloaded = True : Return f + End If + End If + Return Nothing + Catch ex As Exception + If Not e.Exists Then e = New ErrorsDescriber(EDP.ReturnValue) + Return ErrorsDescriber.Execute(e, ex, $"SimpleDownloadAvatar({ImageAddress})", New SFile) + End Try + End Function Protected Overridable Function ChangeFileNameByProvider(ByVal f As SFile, ByVal m As UserMedia) As SFile Dim ff As SFile = Nothing Try diff --git a/SCrawler/API/BaseObjects/InternalSettingsForm.vb b/SCrawler/API/BaseObjects/InternalSettingsForm.vb index 4c96cac..dcad003 100644 --- a/SCrawler/API/BaseObjects/InternalSettingsForm.vb +++ b/SCrawler/API/BaseObjects/InternalSettingsForm.vb @@ -134,6 +134,7 @@ Namespace API.Base m.GetMemberCustomAttributes(Of Provider).ListExists Dim m1 As MemberInfo, m2 As MemberInfo Dim tmpObj As Object + Dim maxOffset% members = GetObjectMembers(MyObject, Function(m) (m.MemberType = MemberTypes.Field Or m.MemberType = MemberTypes.Property) AndAlso Not m.GetCustomAttribute(Of PSettingAttribute) Is Nothing,, True, @@ -175,6 +176,9 @@ Namespace API.Base If MyMembers.Count > 0 Then + maxOffset = MyMembers.Max(Function(mm) mm.LeftOffset) + If maxOffset > 0 Then MyMembers.ForEach(Sub(mm) mm.LeftOffset = maxOffset) + Dim prov As IEnumerable(Of Provider) Dim _prov As Provider Dim si% = -1 diff --git a/SCrawler/API/Bluesky/Declarations.vb b/SCrawler/API/Bluesky/Declarations.vb new file mode 100644 index 0000000..66b1b6a --- /dev/null +++ b/SCrawler/API/Bluesky/Declarations.vb @@ -0,0 +1,18 @@ +' Copyright (C) Andy https://github.com/AAndyProgram +' This program is free software: you can redistribute it and/or modify +' it under the terms of the GNU General Public License as published by +' the Free Software Foundation, either version 3 of the License, or +' (at your option) any later version. +' +' This program is distributed in the hope that it will be useful, +' but WITHOUT ANY WARRANTY +Imports PersonalUtilities.Functions.RegularExpressions +Namespace API.Bluesky + Friend Module Declarations + Friend Const BlueskySiteKey As String = "AndyProgram_Bluesky" + Friend ReadOnly DateProvider As New ADateTime("yyyy-MM-ddTHH:mm:ss.FFF%K") + Friend ReadOnly RegEx_PlayLists As RParams = RParams.DM("RESOLUTION=\d+x(\d+)\s*(\S+)", 0, RegexReturn.List, EDP.ReturnValue) + Friend ReadOnly RegEx_FilePattern As RParams = RParams.DM("(.+?)(\.|@)(gif|m3u8|[^/\?\&]+)", 0, RegexReturn.ListByMatch, EDP.ReturnValue) + Friend ReadOnly RegEx_SinglePostPattern As RParams = RParams.DM("profile/([^/]+)/post/([^/\?\&]+)", 0, RegexReturn.ListByMatch, EDP.ReturnValue) + End Module +End Namespace \ No newline at end of file diff --git a/SCrawler/API/Bluesky/M3U8.vb b/SCrawler/API/Bluesky/M3U8.vb new file mode 100644 index 0000000..45d6200 --- /dev/null +++ b/SCrawler/API/Bluesky/M3U8.vb @@ -0,0 +1,51 @@ +' Copyright (C) Andy https://github.com/AAndyProgram +' This program is free software: you can redistribute it and/or modify +' it under the terms of the GNU General Public License as published by +' the Free Software Foundation, either version 3 of the License, or +' (at your option) any later version. +' +' This program is distributed in the hope that it will be useful, +' but WITHOUT ANY WARRANTY +Imports System.Threading +Imports SCrawler.API.Base +Imports PersonalUtilities.Forms.Toolbars +Imports PersonalUtilities.Tools.Web.Clients +Imports PersonalUtilities.Functions.RegularExpressions +Namespace API.Bluesky + Friend NotInheritable Class M3U8 + Private Sub New() + End Sub + Private Shared Function GetUrlsList(ByVal URL As String) As List(Of String) + Using resp As New Responser With {.AllowAutoRedirect = False} + Dim r$ = resp.GetResponse(URL) + If Not r.IsEmptyString Then + Dim file$ = String.Empty, appender$ + Dim files As List(Of Sizes) = RegexFields(Of Sizes)(r, {RegEx_PlayLists}, {1, 2}) + If files.ListExists Then files.RemoveAll(Function(ff) ff.Value = 0 Or ff.Data.IsEmptyString) + If files.ListExists Then + files.Sort() + file = files(0).Data + appender = URL.Replace(URL.Split("/").Last, String.Empty) + file = M3U8Base.CreateUrl(appender, file) + If Not file.IsEmptyString Then + r = resp.GetResponse(file) + If Not r.IsEmptyString Then + Dim l As List(Of String) = RegexReplace(r, M3U8Declarations.TsFilesRegEx) + If l.ListExists Then + appender = file.Replace(file.Split("/").Last, String.Empty) + For i% = 0 To l.Count - 1 : l(i) = M3U8Base.CreateUrl(appender, l(i)) : Next + Return l + End If + End If + End If + End If + End If + End Using + Return Nothing + End Function + Friend Shared Function Download(ByVal URL As String, ByVal Destination As SFile, ByVal Token As CancellationToken, + ByVal Progress As MyProgress, ByVal UsePreProgress As Boolean) As SFile + Return M3U8Base.Download(GetUrlsList(URL), Destination,, Token, Progress, UsePreProgress) + End Function + End Class +End Namespace \ No newline at end of file diff --git a/SCrawler/API/Bluesky/SiteSettings.vb b/SCrawler/API/Bluesky/SiteSettings.vb new file mode 100644 index 0000000..679a4cd --- /dev/null +++ b/SCrawler/API/Bluesky/SiteSettings.vb @@ -0,0 +1,100 @@ +' Copyright (C) Andy https://github.com/AAndyProgram +' This program is free software: you can redistribute it and/or modify +' it under the terms of the GNU General Public License as published by +' the Free Software Foundation, either version 3 of the License, or +' (at your option) any later version. +' +' This program is distributed in the hope that it will be useful, +' but WITHOUT ANY WARRANTY +Imports SCrawler.API.Base +Imports SCrawler.Plugin +Imports SCrawler.Plugin.Attributes +Imports PersonalUtilities.Functions.XML +Imports PersonalUtilities.Functions.RegularExpressions +Imports PersonalUtilities.Tools.Web.Clients +Imports PersonalUtilities.Tools.Web.Documents.JSON +Namespace API.Bluesky + + Friend Class SiteSettings : Inherits SiteSettingsBase + + Friend ReadOnly Property CookiesEnabled As PropertyValue + + Friend ReadOnly Property UserHandle As PropertyValue + + Friend ReadOnly Property UserPassword As PropertyValue + Friend ReadOnly Property Token As PropertyValue + Friend ReadOnly Property TokenUpdateTime As PropertyValue + + Friend ReadOnly Property TokenRefreshInterval As PropertyValue + Friend Sub New(ByVal AccName As String, ByVal Temp As Boolean) + MyBase.New("Bluesky", "bsky.app", AccName, Temp, My.Resources.SiteResources.BlueskyIcon_32, My.Resources.SiteResources.BlueskyPic_32) + + Responser.ContentType = "application/json" + + CookiesEnabled = New PropertyValue(False) + UserHandle = New PropertyValue(String.Empty, GetType(String)) + UserPassword = New PropertyValue(String.Empty, GetType(String)) + Token = New PropertyValue(String.Empty, GetType(String)) + TokenUpdateTime = New PropertyValue(Now.AddYears(-1)) + TokenRefreshInterval = New PropertyValue(120) + + _AllowUserAgentUpdate = False + UrlPatternUser = "https://bsky.app/profile/{0}" + ImageVideoContains = "bsky.app" + UserRegex = RParams.DMS("bsky.app/profile/([^/\?]+)", 1, EDP.ReturnValue) + UserOptionsType = GetType(EditorExchangeOptionsBase) + End Sub + Protected Overrides Function UserOptionsValid(ByVal Options As Object) As Boolean + Return DirectCast(Options, EditorExchangeOptionsBase).SiteKey = BlueskySiteKey + End Function + Protected Overrides Sub UserOptionsSetParameters(ByRef Options As Object) + DirectCast(Options, EditorExchangeOptionsBase).SiteKey = BlueskySiteKey + End Sub + Friend Overrides Function GetInstance(ByVal What As ISiteSettings.Download) As IPluginContentProvider + Return New UserData + End Function + Friend Overrides Function BaseAuthExists() As Boolean + Return Not CStr(UserHandle.Value).IsEmptyString And Not CStr(UserPassword.Value).IsEmptyString + End Function + Friend Overrides Function Available(ByVal What As ISiteSettings.Download, ByVal Silent As Boolean) As Boolean + Return MyBase.Available(What, Silent) AndAlso UpdateToken() + End Function + Private _TokenUpdating As Boolean = False + Friend Function UpdateToken(Optional ByVal Force As Boolean = False) As Boolean + Try + While _TokenUpdating : Threading.Thread.Sleep(100) : End While + _TokenUpdating = True + If BaseAuthExists() Then + If CDate(TokenUpdateTime.Value).AddMinutes(TokenRefreshInterval.Value) < Now Or Force Then + Using resp As Responser = Responser.Copy + With resp + .Mode = Responser.Modes.Curl + .Method = "POST" + .CurlSslNoRevoke = True + .CurlInsecure = True + .CurlArgumentsLeft = "-d ""{\" & $"""identifier\"": \""{UserHandle.Value}\"", \""password\"": \""{UserPassword.Value}\""" & "}""" + + Dim r$ = .GetResponse("https://bsky.social/xrpc/com.atproto.server.createSession") + If Not r.IsEmptyString Then + Using j As EContainer = JsonDocument.Parse(r, EDP.ReturnValue) + If j.ListExists Then + Dim t$ = j.Value("accessJwt") + If Not t.IsEmptyString Then Token.Value = $"Bearer {t}" : TokenUpdateTime.Value = Now : Return True + End If + End Using + End If + End With + End Using + Else + Return True + End If + End If + Return False + Catch ex As Exception + Return ErrorsDescriber.Execute(EDP.SendToLog + EDP.ReturnValue, ex, "Bluesky.SiteSettings.UpdateToken", False) + Finally + _TokenUpdating = False + End Try + End Function + End Class +End Namespace \ No newline at end of file diff --git a/SCrawler/API/Bluesky/UserData.vb b/SCrawler/API/Bluesky/UserData.vb new file mode 100644 index 0000000..2fcc8c9 --- /dev/null +++ b/SCrawler/API/Bluesky/UserData.vb @@ -0,0 +1,330 @@ +' Copyright (C) Andy https://github.com/AAndyProgram +' This program is free software: you can redistribute it and/or modify +' it under the terms of the GNU General Public License as published by +' the Free Software Foundation, either version 3 of the License, or +' (at your option) any later version. +' +' This program is distributed in the hope that it will be useful, +' but WITHOUT ANY WARRANTY +Imports System.Threading +Imports SCrawler.API.Base +Imports SCrawler.API.YouTube.Objects +Imports PersonalUtilities.Functions.XML +Imports PersonalUtilities.Functions.RegularExpressions +Imports PersonalUtilities.Tools.Web.Documents.JSON +Imports UTypes = SCrawler.API.Base.UserMedia.Types +Imports UStates = SCrawler.API.Base.UserMedia.States +Namespace API.Bluesky + Friend Class UserData : Inherits UserDataBase +#Region "Declarations" + Private ReadOnly Property MySettings As SiteSettings + Get + Return HOST.Source + End Get + End Property + Private ReadOnly Property ID_Encoded As String + Get + Return If(ID.IsEmptyString, String.Empty, SymbolsConverter.ASCII.EncodeSymbolsOnly(ID)) + End Get + End Property +#End Region +#Region "Loader" + Protected Overrides Sub LoadUserInformation_OptionalFields(ByRef Container As XmlFile, ByVal Loading As Boolean) + End Sub + Friend Overrides Function ExchangeOptionsGet() As Object + Return New EditorExchangeOptionsBase(Me) With {.SiteKey = BlueskySiteKey} + End Function + Friend Overrides Sub ExchangeOptionsSet(ByVal Obj As Object) + If Not Obj Is Nothing AndAlso TypeOf Obj Is EditorExchangeOptionsBase AndAlso + DirectCast(Obj, EditorExchangeOptionsBase).SiteKey = BlueskySiteKey Then NameTrue = DirectCast(Obj, EditorExchangeOptionsBase).UserName + End Sub +#End Region +#Region "Initializer" + Friend Sub New() + UseInternalM3U8Function = True + End Sub +#End Region +#Region "Token" + Private Function UpdateToken(Optional ByVal Force As Boolean = False, Optional ByVal OnlyAddHeader As Boolean = False) As Boolean + Dim process As Boolean = True + If CDate(MySettings.TokenUpdateTime.Value).AddHours(2) <= Now Or Force Then + process = MySettings.UpdateToken(Force) + If process Then _TokenUpdateCount += 1 + End If + If process Or OnlyAddHeader Then Responser.Headers.Add("authorization", MySettings.Token.Value) + Return Not Responser.Headers.Value("authorization").IsEmptyString + End Function + Private _TokenUpdateCount As Integer = 0 + Private Sub TokenUpdateCountReset() + _TokenUpdateCount = 0 + End Sub +#End Region +#Region "Download" + Private _PostCount As Integer = 0 + Protected Overrides Sub DownloadDataF(ByVal Token As CancellationToken) + If Not CBool(MySettings.CookiesEnabled.Value) Then Responser.Cookies.Clear() + UpdateToken(, True) + _TokenUpdateCount = 0 + _PostCount = 0 + DownloadData(String.Empty, Token) + End Sub + Private Overloads Sub DownloadData(ByVal Cursor As String, ByVal Token As CancellationToken) + Dim URL$ = String.Empty + Try + If ID.IsEmptyString Then GetProfileInfo(Token) + If ID.IsEmptyString Then Throw New ArgumentNullException("ID", "ID is null") + If UpdateToken() Then + Dim nextCursor$ = String.Empty + Dim c% + URL = $"https://bsky.social/xrpc/app.bsky.feed.getAuthorFeed?actor={ID_Encoded}&filter=posts_and_author_threads&includePins=false&limit=99" + If Not Cursor.IsEmptyString Then URL &= $"&cursor={SymbolsConverter.ASCII.EncodeSymbolsOnly(Cursor)}" + Dim r$ = Responser.GetResponse(URL) + TokenUpdateCountReset() + If Not r.IsEmptyString Then + Using j As EContainer = JsonDocument.Parse(r) + If j.ListExists Then + With j("feed") + If .ListExists Then + For Each post As EContainer In .Self + With post({"post"}) + c = DefaultParser(.Self,, nextCursor) + Select Case c + Case CInt(DateResult.Skip) * -1 : Continue For + Case CInt(DateResult.Exit) * -1 : Exit Sub + Case Is > 0 : _PostCount += c + End Select + If DownloadTopCount.HasValue AndAlso DownloadTopCount.Value <= _PostCount Then Exit Sub + End With + Next + End If + End With + End If + End Using + + If Not nextCursor.IsEmptyString Then DownloadData(nextCursor, Token) + End If + End If + Catch ex As Exception + ProcessException(ex, Token, $"DownloadData({URL})") + End Try + End Sub +#End Region +#Region "DefaultParser" + Private Const Down_ImageAddress As String = "https://cdn.bsky.app/img/feed_fullsize/plain/{0}/{1}" + Private Function GetPostID(ByVal PostUri As String) As String + Return If(PostUri.IsEmptyString, String.Empty, PostUri.Split("/").LastOrDefault) + End Function + Private Function DefaultParser(ByVal e As EContainer, Optional ByVal CheckDateLimits As Boolean = True, Optional ByRef NextCursor As String = Nothing, + Optional ByVal CheckTempPosts As Boolean = True, Optional ByVal State As UStates = UStates.Unknown) As Integer + Const exitReturn% = CInt(DateResult.Exit) * -1 + Dim postID$, postDate$, __url$, __urlBase$ + Dim updateUrl As Boolean + Dim c% = 0 + Dim m As UserMedia + Dim d As EContainer + With e + If .ListExists Then + postID = GetPostID(.Value("uri")) + postDate = String.Empty + __urlBase = String.Empty + With .Item({"record"}) + If .ListExists Then + '2025-01-28T02:42:12.415Z + postDate = .Value("createdAt") + NextCursor = postDate + If CheckDateLimits Then + Select Case CheckDatesLimit(postDate, DateProvider) + Case DateResult.Skip : Return CInt(DateResult.Skip) * -1 'Continue For + Case DateResult.Exit : Return exitReturn 'Exit Sub + End Select + End If + + If CheckTempPosts Then + If _TempPostsList.Contains(postID) Then Return exitReturn Else _TempPostsList.Add(postID) + End If + __urlBase = $"https://bsky.app/profile/{NameTrue}/post/{postID}" + End If + End With + + Dim createMedia As Func(Of String, UTypes, UserMedia) = + Function(ByVal url As String, ByVal type As UTypes) As UserMedia + m = New UserMedia(url, type) With { + .URL_BASE = __urlBase, + .File = CreateFileFromUrl(url, type), + .Post = New UserPost(postID, If(AConvert(Of Date)(postDate, DateProvider, Nothing, EDP.ReturnValue), Nothing)), + .State = State + } + _TempMediaList.ListAddValue(m, LNC) + c += 1 + Return m + End Function + + For Each SecondExtraction As Boolean In {False, True} + With If(SecondExtraction, .Item({"record", "embed"}), .Item("embed")) + If .ListExists Then + + If If(.Item("images")?.Count, 0) > 0 Then + With .Item("images") + For Each d In .Self + updateUrl = False + __url = d.Value("fullsize") + If __url.IsEmptyString Then __url = d.Value({"image", "ref"}, "$link") : updateUrl = True + If __url.IsEmptyString And SecondExtraction Then updateUrl = False : __url = e.Value({"embed"}, "thumb") + If Not __url.IsEmptyString Then createMedia(__url, UTypes.Picture) + Next + End With + End If + + If Not .Value("playlist").IsEmptyString Then createMedia(.Value("playlist"), UTypes.m3u8) + + If If(.Item("external")?.Count, 0) > 0 Then createMedia(.Value({"external"}, "uri"), UTypes.GIF) + + End If + End With + + If c > 0 Then Exit For + Next + End If + End With + Return c + End Function +#End Region +#Region "GetProfileInfo" + Private Sub GetProfileInfo(ByVal Token As CancellationToken) + Try + If UpdateToken() Then + Dim r$ = Responser.GetResponse($"https://bsky.social/xrpc/app.bsky.actor.getProfile?actor={ID.IfNullOrEmpty(NameTrue)}") + TokenUpdateCountReset() + If Not r.IsEmptyString Then + Using j As EContainer = JsonDocument.Parse(r) + If j.ListExists Then + ID = j.Value("did") + UserSiteNameUpdate(j.Value("displayName")) + UserDescriptionUpdate(j.Value("description")) + NameTrue = j.Value("handle") + SimpleDownloadAvatar(j.Value("avatar")) + SimpleDownloadAvatar(j.Value("banner")) + End If + End Using + End If + Else + Throw New ArgumentException("Token is null", "Token") + End If + Catch ex As Exception + ProcessException(ex, Token, "GetProfileInfo") + End Try + End Sub +#End Region +#Region "ReparseMissing" + Protected Overrides Sub ReparseMissing(ByVal Token As CancellationToken) + Const uriPattern$ = "at://{0}/app.bsky.feed.post/{1}" + Dim rList As New List(Of Integer) + Try + If ContentMissingExists AndAlso UpdateToken() Then + Dim r$, url$, uri$ + Dim tu As Byte + Dim m As UserMedia + Dim j As EContainer + For i% = 0 To _ContentList.Count - 1 + m = _ContentList(i) + If m.State = UStates.Missing Then + uri = SymbolsConverter.ASCII.EncodeSymbolsOnly(String.Format(uriPattern, NameTrue, m.Post.ID)) + url = $"https://bsky.social/xrpc/app.bsky.feed.getPostThread?uri={uri}&depth=10" + For tu = 0 To 1 + Try + Responser.ResetStatus() + r = Responser.GetResponse(url) + TokenUpdateCountReset() + If Not r.IsEmptyString Then + j = JsonDocument.Parse(r) + If j.ListExists Then + If DefaultParser(j({"thread", "post"}), False,, False, UStates.Missing) > 0 Then rList.Add(i) + j.Dispose() + End If + End If + Exit For + Catch eex As Exception + If ProcessException(eex, Token, $"ReparseMissing({url})",,, False) <> 1 Then Throw eex + End Try + Next + End If + Next + Else + Throw New ArgumentException("Token is null", "Token") + End If + Catch ex As Exception + ProcessException(ex, Token, "ReparseMissing error") + Finally + If rList.Count > 0 Then + For i% = rList.Count - 1 To 0 Step -1 : _ContentList.RemoveAt(rList(i)) : Next + rList.Clear() + End If + End Try + End Sub +#End Region +#Region "CreateFileFromUrl" + Protected Overloads Overrides Function CreateFileFromUrl(ByVal URL As String) As SFile + Return CreateFileFromUrl(URL, UTypes.Undefined) + End Function + Protected Overloads Function CreateFileFromUrl(ByVal URL As String, ByVal Type As UTypes) As SFile + Dim f As SFile = MyBase.CreateFileFromUrl(URL) + Dim force As Boolean = False + f.Separator = "\" + With URL.Split("/") + If .ListExists Then + With DirectCast(RegexReplace(.Last, RegEx_FilePattern), List(Of String)) + If .ListExists(4) Then + f.Name = .Item(1).IfNullOrEmpty(f.Name) + f.Extension = .Item(3) + End If + End With + End If + End With + If Not f.Extension.IsEmptyString AndAlso f.Extension.ToLower = "m3u8" Then force = True : Type = UTypes.m3u8 + If f.Extension.IsEmptyString Or force Then + Select Case Type + Case UTypes.Picture : f.Extension = "jpg" + Case UTypes.GIF : f.Extension = "gif" + Case UTypes.m3u8 : f.Name = "Video" : f.Extension = "mp4" + End Select + End If + Return f + End Function +#End Region +#Region "DownloadContent" + Protected Overrides Sub DownloadContent(ByVal Token As CancellationToken) + DownloadContentDefault(Token) + End Sub + Protected Overrides Function DownloadM3U8(ByVal URL As String, ByVal Media As UserMedia, ByVal DestinationFile As SFile, ByVal Token As CancellationToken) As SFile + Return M3U8.Download(URL, DestinationFile, Token, Progress, Not IsSingleObjectDownload) + End Function +#End Region +#Region "DownloadSingleObject" + Protected Overrides Sub DownloadSingleObject_GetPosts(ByVal Data As IYouTubeMediaContainer, ByVal Token As CancellationToken) + _TokenUpdateCount = 0 + UpdateToken() + Dim l As List(Of String) = RegexReplace(Data.URL, RegEx_SinglePostPattern) + If l.ListExists(3) Then + NameTrue = l(1) + _ContentList.Add(New UserMedia(Data.URL) With {.State = UStates.Missing, .Post = l(2)}) + ReparseMissing(Token) + End If + MyBase.DownloadSingleObject_GetPosts(Data, Token) + End Sub +#End Region +#Region "Exception" + Protected Overrides Function DownloadingException(ByVal ex As Exception, ByVal Message As String, Optional ByVal FromPE As Boolean = False, + Optional ByVal EObj As Object = Nothing) As Integer + If Responser.StatusCode = Net.HttpStatusCode.BadRequest Then '400 + If _TokenUpdateCount = 0 AndAlso UpdateToken(True) Then + Return 1 + Else + Return 0 + End If + Else + Return 0 + End If + End Function +#End Region + End Class +End Namespace \ No newline at end of file diff --git a/SCrawler/API/Facebook/Declarations.vb b/SCrawler/API/Facebook/Declarations.vb index fbeaf13..8d80a15 100644 --- a/SCrawler/API/Facebook/Declarations.vb +++ b/SCrawler/API/Facebook/Declarations.vb @@ -18,6 +18,8 @@ Namespace API.Facebook Friend ReadOnly Regex_FileName As RParams = RParams.DM("([^/\?]+\..{3,4})(?=(\?|\Z))", 0, EDP.ReturnValue) Friend ReadOnly Regex_ProfileUrlID As RParams = RParams.DMS("profile.php\?id=(\d+)", 1, EDP.ReturnValue) Friend ReadOnly Regex_VideoPageID As RParams = RParams.DMS("pageid.:.(\d+)", 1, RegexOptions.IgnoreCase, EDP.ReturnValue) + Friend ReadOnly Regex_ReelsPageID As RParams = RParams.DMS("\{[^\}]*""tab_key"":""owner_reels"",?[^\}]*""id"":""([^\}""]+)""", 1, RegexOptions.IgnoreCase, EDP.ReturnValue) + Friend ReadOnly Regex_ReelsFilePattern As RParams = RParams.DM("[^/]+\.mp4", 0, EDP.ReturnValue) Friend ReadOnly Regex_StoryBucket As RParams = RParams.DMS("story_bucket[^\>]*?(\d+)", 1, EDP.ReturnValue) Friend ReadOnly Regex_VideoIDFromURL As RParams = RParams.DMS("facebook.com/([^/]+/videos/|watch/\D*[\?&]{1}v=)(\d+)", 2, EDP.ReturnValue) diff --git a/SCrawler/API/Facebook/SiteSettings.vb b/SCrawler/API/Facebook/SiteSettings.vb index f246609..b443cc0 100644 --- a/SCrawler/API/Facebook/SiteSettings.vb +++ b/SCrawler/API/Facebook/SiteSettings.vb @@ -36,6 +36,8 @@ Namespace API.Facebook Friend ReadOnly Property ParsePhotoBlock As PropertyValue Friend ReadOnly Property ParseVideoBlock As PropertyValue + + Friend ReadOnly Property ParseReelsBlock As PropertyValue Friend ReadOnly Property ParseStoriesBlock As PropertyValue #End Region @@ -52,6 +54,7 @@ Namespace API.Facebook Header_Accept = New PropertyValue(String.Empty, GetType(String)) ParsePhotoBlock = New PropertyValue(True) ParseVideoBlock = New PropertyValue(True) + ParseReelsBlock = New PropertyValue(False) ParseStoriesBlock = New PropertyValue(True) UrlPatternUser = "https://www.facebook.com/{0}" diff --git a/SCrawler/API/Facebook/UserData.vb b/SCrawler/API/Facebook/UserData.vb index 5c5eff0..d0a204f 100644 --- a/SCrawler/API/Facebook/UserData.vb +++ b/SCrawler/API/Facebook/UserData.vb @@ -23,9 +23,11 @@ Namespace API.Facebook Private Const Name_IsNoNameProfile As String = "IsNoNameProfile" Private Const Name_OptionsParsed As String = "OptionsParsed" Private Const Name_VideoPageID As String = "VideoPageID" + Private Const Name_ReelsPageID As String = "ReelsPageID" Private Const Name_StoryBucket As String = "StoryBucket" Private Const Name_ParsePhotoBlock As String = "ParsePhotoBlock" Private Const Name_ParseVideoBlock As String = "ParseVideoBlock" + Private Const Name_ParseReelsBlock As String = "ParseReelsBlock" Private Const Name_ParseStoriesBlock As String = "ParseStoriesBlock" #End Region #Region "Declarations" @@ -37,15 +39,18 @@ Namespace API.Facebook Private IsNoNameProfile As Boolean = False Private OptionsParsed As Boolean = False Private Property VideoPageID As String = String.Empty + Private Property ReelsPageID As String = String.Empty Private Property StoryBucket As String = String.Empty Friend Property ParsePhotoBlock As Boolean = True Friend Property ParseVideoBlock As Boolean = True + Friend Property ParseReelsBlock As Boolean = False Friend Property ParseStoriesBlock As Boolean = True Private Enum PageBlock As Integer Timeline = Sections.Timeline Stories = Sections.Stories Photos = 100 Videos = 101 + Reels = Sections.Reels Undefined = -1 End Enum #End Region @@ -67,6 +72,7 @@ Namespace API.Facebook With DirectCast(Obj, UserExchangeOptions) ParsePhotoBlock = .ParsePhotoBlock ParseVideoBlock = .ParseVideoBlock + ParseReelsBlock = .ParseReelsBlock ParseStoriesBlock = .ParseStoriesBlock End With End If @@ -90,18 +96,22 @@ Namespace API.Facebook End If OptionsParsed = .Value(Name_OptionsParsed).FromXML(Of Boolean)(False) VideoPageID = .Value(Name_VideoPageID) + ReelsPageID = .Value(Name_ReelsPageID) StoryBucket = .Value(Name_StoryBucket) ParsePhotoBlock = .Value(Name_ParsePhotoBlock).FromXML(Of Boolean)(True) ParseVideoBlock = .Value(Name_ParseVideoBlock).FromXML(Of Boolean)(True) + ParseReelsBlock = .Value(Name_ParseReelsBlock).FromXML(Of Boolean)(False) ParseStoriesBlock = .Value(Name_ParseStoriesBlock).FromXML(Of Boolean)(True) Else updateNames.Invoke .Add(Name_IsNoNameProfile, IsNoNameProfile.BoolToInteger) .Add(Name_OptionsParsed, OptionsParsed.BoolToInteger) .Add(Name_VideoPageID, VideoPageID) + .Add(Name_ReelsPageID, ReelsPageID) .Add(Name_StoryBucket, StoryBucket) .Add(Name_ParsePhotoBlock, ParsePhotoBlock.BoolToInteger) .Add(Name_ParseVideoBlock, ParseVideoBlock.BoolToInteger) + .Add(Name_ParseReelsBlock, ParseReelsBlock.BoolToInteger) .Add(Name_ParseStoriesBlock, ParseStoriesBlock.BoolToInteger) End If End With @@ -146,6 +156,7 @@ Namespace API.Facebook Else If DownloadImages And ParsePhotoBlock Then DownloadData_Photo(String.Empty, Token) If DownloadVideos And ParseVideoBlock Then DownloadData_Video(String.Empty, Token) + If DownloadVideos And ParseReelsBlock Then DownloadData_Reels(String.Empty, Token) If (DownloadImages Or DownloadVideos) And ParseStoriesBlock Then DownloadData_Stories(Token) End If LoadSavePostsKV(False) @@ -158,10 +169,12 @@ Namespace API.Facebook Private Const Header_fb_fr_name_Video As String = "PagesCometChannelTabAllVideosCardImplPaginationQuery" Private Const Header_fb_fr_name_Stories As String = "StoriesSuspenseContentPaneRootWithEntryPointQuery" Private Const Header_fb_fr_name_SavedPosts As String = "CometSaveDashboardAllItemsPaginationQuery" + Private Const Header_fb_fr_name_Reels As String = "ProfileCometAppCollectionReelsRendererPaginationQuery" Private Const DocID_Photo As String = "6684543058255697" Private Const DocID_Video As String = "24545934291687581" Private Const DocID_Stories As String = "6771064226315961" Private Const DocID_SavedPosts As String = "7112228098805003" + Private Const DocID_Reels As String = "28517740954539304" Private Const Graphql_UrlPattern As String = "https://www.facebook.com/api/graphql?lsd={0}&doc_id={1}&server_timestamps=true&fb_dtsg={3}&fb_api_req_friendly_name={2}&variables={4}" Private Const VideoHtmlUrlPattern As String = "https://www.facebook.com/watch/?v={0}" Private Sub DownloadData_Photo(ByVal Cursor As String, ByVal Token As CancellationToken) @@ -238,7 +251,7 @@ Namespace API.Facebook Dim newPostsDetected As Boolean = False Dim pid As PostKV - If VideoPageID.IsEmptyString Then GetVideoPageID(Token) + If VideoPageID.IsEmptyString Then GetVideoPageID(False, Token) If VideoPageID.IsEmptyString Then Throw New TokensException("Unable to obtain 'VideoPageID'", False) ValidateBaseTokens() @@ -355,6 +368,123 @@ Namespace API.Facebook ProcessException(ex, Token, $"data (stories) downloading error [{URL}]",, Responser) End Try End Sub + Private Sub DownloadData_Reels(ByVal Cursor As String, ByVal Token As CancellationToken) + Dim URL$ = String.Empty + Const VarPattern$ = """count"":10,""cursor"":{0},""feedLocation"":""COMET_MEDIA_VIEWER"",""feedbackSource"":65,""focusCommentID"":null,""renderLocation"":null,""scale"":1,""useDefaultActor"":true,""id"":""{1}"",""__relay_internal__pv__FBReelsMediaFooter_comet_enable_reels_ads_gkrelayprovider"":true,""__relay_internal__pv__IsWorkUserrelayprovider"":false" + Try + Dim nextCursor$ = String.Empty + Dim newPostsDetected As Boolean = False + Dim nodeFound As Boolean = False + Dim pid As PostKV = Nothing + Dim __urlBase$ = String.Empty + Dim lines As List(Of String) + Dim j As EContainer, rr As EContainer + Dim jDataRoot As EContainer = Nothing + Dim indx% = -1 + Dim s As New List(Of Sizes) + Dim videoIdNode$() = {"profile_reel_node", "node", "video", "id"} + + Dim obtainBasePostData As Action = Sub() + If indx.ValueBetween(0, jDataRoot.Count - 1) Then + With jDataRoot(indx) + pid = New PostKV(String.Empty, .Item(videoIdNode).XmlIfNothingValue. + IfNullOrEmpty(.Value({"node"}, "id")), PageBlock.Reels) + pid.Code = $"Reels:{pid.ID}" + nextCursor = .Value("cursor") + If Not .Item(videoIdNode).XmlIfNothing.IsEmptyString Then + __urlBase = $"https://www.facebook.com/reel/{pid.ID}" + Else + __urlBase = String.Empty + End If + End With + Else + pid = Nothing + nextCursor = String.Empty + __urlBase = String.Empty + End If + End Sub + Dim createFile As Func(Of String, SFile, SFile) = Function(ByVal __url As String, ByVal cFile As SFile) As SFile + Dim f As New SFile(RegexReplace(__url, Regex_ReelsFilePattern)) + If Not f.IsEmptyString Then Return f Else Return cFile + End Function + + If ReelsPageID.IsEmptyString Then GetVideoPageID(True, Token) + If ReelsPageID.IsEmptyString Then Throw New TokensException("Unable to obtain 'ReelsPageID'", False) + ValidateBaseTokens() + + URL = String.Format(Graphql_UrlPattern, Token_lsd, DocID_Reels, Header_fb_fr_name_Reels, Token_dtsg_Var, + SymbolsConverter.ASCII.EncodeSymbolsOnly("{" & String.Format(VarPattern, If(Cursor.IsEmptyString, "null", $"""{Cursor}"""), ReelsPageID) & "}")) + + ResponserApplyDefs(Header_fb_fr_name_Reels) + ThrowAny(Token) + + WaitTimer() + Dim r$ = Responser.GetResponse(URL) + If Not r.IsEmptyString Then + lines = r.StringToList(Of String)(vbCrLf).ListIfNothing + If lines.ListExists Then + For Each line$ In lines + j = JsonDocument.Parse(line, EDP.ReturnValue) + If j.ListExists Then + jDataRoot = j({"data", "node", "aggregated_fb_shorts", "edges"}) + If jDataRoot.ListExists Then + + With j({"extensions", "all_video_dash_prefetch_representations"}) + If .ListExists Then + ProgressPre.ChangeMax(.Count) + For indx = 0 To .Count - 1 + ProgressPre.Perform() + + obtainBasePostData() + If Not pid.ID.IsEmptyString AndAlso Not PostKvExists(pid) Then + newPostsDetected = True + PostsKVIDs.ListAddValue(pid, LNC) + _TempPostsList.Add(pid.Code) + + With .ItemF({indx, "representations"}) + If .ListExists Then + s.Clear() + For Each rr In .Self : s.Add(New Sizes(rr.Value("width"), rr.Value("base_url"))) : Next + If s.Count > 0 Then s.RemoveAll(Function(ss) ss.Value = 0 Or ss.Data.IsEmptyString) + If s.Count > 0 Then + s.Sort() + _TempMediaList.ListAddValue(New UserMedia(s(0).Data, UTypes.Video) With { + .URL_BASE = __urlBase.IfNullOrEmpty(.URL_BASE), + .Post = pid.ID, + .File = createFile(s(0).Data, .File), + .SpecialFolder = "Reels*" + }, LNC) + s.Clear() + End If + End If + End With + + + If Limit > 0 And _TempMediaList.Count >= Limit Then j.Dispose() : Exit Sub + Else + j.Dispose() + Exit Sub + End If + + Next + End If + End With + + End If + + j.Dispose() + End If + Next + End If + End If + + If newPostsDetected And Not nextCursor.IsEmptyString Then DownloadData_Reels(nextCursor, Token) + Catch tex As TokensException When Not tex.BasicTokens + TokensException.SendToLog(Me, tex, "data (reels)") + Catch ex As Exception + ProcessException(ex, Token, $"data (reels) downloading error [{URL}]",, Responser) + End Try + End Sub Private Sub DownloadData_SavedPosts(ByVal Cursor As String, ByVal Token As CancellationToken) Dim URL$ = String.Empty Const VarPattern$ = """content_filter"":[],""count"":10,""cursor"":{0},""scale"":1,""use_case"":""SAVE_DEFAULT""" @@ -507,13 +637,19 @@ Namespace API.Facebook Return True End If End Function - Private Sub GetVideoPageID(ByVal Token As CancellationToken) - Dim URL$ = $"{GetProfileUrl()}\videos" + Private Sub GetVideoPageID(ByVal GetReels As Boolean, ByVal Token As CancellationToken) + Dim URL$ = $"{GetProfileUrl()}\{IIf(GetReels, "reels", "videos")}" Dim resp As Responser = HtmlResponserCreate() Try WaitTimer() Dim r$ = resp.GetResponse(URL) - If Not r.IsEmptyString Then VideoPageID = RegexReplace(r, Regex_VideoPageID) + If Not r.IsEmptyString Then + If GetReels Then + ReelsPageID = RegexReplace(r, Regex_ReelsPageID) + Else + VideoPageID = RegexReplace(r, Regex_VideoPageID) + End If + End If Catch ex As Exception ProcessException(ex, Token, "get video page ID",, resp) Finally @@ -658,17 +794,39 @@ Namespace API.Facebook HtmlResponserDispose(resp) End Try End Sub + Private Structure VideoResolution : Implements IComparable(Of VideoResolution) + Friend W As Integer + Friend H As Integer + Friend B As Integer + Friend U As String + Friend ReadOnly Property Wrong As Boolean + Get + Return W = 0 Or H = 0 Or B = 0 Or U.IsEmptyString + End Get + End Property + Private Function CompareTo(ByVal Other As VideoResolution) As Integer Implements IComparable(Of VideoResolution).CompareTo + Return CLng(Math.Max(W, H) * B).CompareTo(CLng(Math.Max(Other.W, Other.H) * Other.B)) * -1 + End Function + End Structure Protected Function ReparseSingleVideo(ByVal m As UserMedia, ByVal resp As Responser, ByRef result As Boolean) As UserMedia Const nameSD$ = "browser_native_sd_url" Const nameHD$ = "browser_native_hd_url" + Const nameDPR$ = "all_video_dash_prefetch_representations" Const pattern$ = "