Big Data and Knowledge Graph Based Fault Diagnosis for Electric Power Systems

Fault detection plays an important role in the daily maintenance of electric power systems. Big data and knowledge graphs (KGs) have been proposed to solve many problems in the industrial Internet of Things, and they also offer considerable potential for improving fault detection in electric power systems. This paper analyzes a distributed knowledge graph framework for fault detection in electric power systems, in which multiple devices train local detection models with the assistance of a central server. Each device owns a local data set composed of historical fault information and the current device state, which can be used to train a local model for fault detection. To enhance the detection performance, the distributed devices interact with each other in the KG framework, where each device must complete its local computation and the model aggregation within a specified latency threshold. By exploiting the dynamic characteristics and the limited capability of the devices, we optimize the knowledge graph framework by maximizing the number of active devices under latency and bandwidth constraints. In particular, two frequency allocation (FA) schemes are developed for the distributed knowledge graph framework: scheme I is based on the instantaneous device state information (DSI), and scheme II uses the particle swarm optimization (PSO) technique together with the statistical DSI. Simulation results on test accuracy and convergence are finally presented to show the advantages of the proposed distributed KG framework in fault detection for electric power systems.


Introduction
At present, China is in a critical period of the integration of informatization and industrialization. Power informatization and intelligence are the inevitable product of "Internet + power" [1][2][3]. Electric power is the foundation of a country's development [4,5]. In recent years, the smart grid has occupied an important position in the economy. Because the smart grid is widely distributed, climate, natural disasters and other causes can lead to power failures, even continuous ones, and paralyze the grid. Such failures not only seriously affect people's lives, but also cause irreparable losses to enterprises and even to the whole national economy [6][7][8]. Therefore, a reliable and accurate power grid fault diagnosis system is of great significance for finding faulty equipment, diagnosing fault causes and eliminating faults in time.
Monitoring and data acquisition systems have been applied since the early development of the power grid. Through monitoring equipment, they can feed back the voltage or current changes and other electrical quantities of each device in the power grid in real time [9][10][11], which also provides engineers with data for fault diagnosis when a power grid fault occurs [12][13][14]. However, the power grid is a dynamic system with a complex structure and operation mode, and the causes of faults are diverse. Sometimes the fault signal is not directly related to the cause of the fault, which poses a great obstacle to the accurate diagnosis of power grid faults.
The rapid development of big data and artificial intelligence brings opportunities for the intelligent diagnosis of power faults. Traditional manual troubleshooting of power grid faults not only consumes a lot of manpower, but also cannot guarantee reliability and accuracy, and the location and cause of faults cannot be found in time [15][16][17][18]. Applying big data mining and knowledge graph technology from the field of artificial intelligence to power diagnosis can realize real-time monitoring, prediction and early warning analysis of the power grid, shorten the troubleshooting time, and greatly improve the accuracy of power grid fault detection.
By using the emerging technologies of big data and knowledge graphs [19][20][21], this paper analyzes a distributed knowledge graph framework for fault detection in electric power systems, in which multiple devices train local detection models with the assistance of a central server. Each device owns a local data set composed of historical fault information and the current device state, which can be used to train a local model for fault detection. To enhance the detection performance, the distributed devices interact with each other in the KG framework, where each device must complete its local computation and the model aggregation within a specified latency threshold. By exploiting the dynamic characteristics and the limited capability of the devices, we optimize the knowledge graph framework by maximizing the number of active devices under latency and bandwidth constraints. In particular, two frequency allocation (FA) schemes are developed for the distributed knowledge graph framework: scheme I is based on the instantaneous device state information (DSI), and scheme II uses the particle swarm optimization (PSO) technique together with the statistical DSI. Simulation results on test accuracy and convergence are finally presented to show the advantages of the proposed distributed KG framework in fault detection for electric power systems.
The remainder of this paper is organized as follows. After the introduction, Section II discusses the distributed knowledge graph framework under the latency constraint for electric power systems. Then, two frequency allocation (FA) methods based on DSI, which improve the fault detection accuracy of the system, are provided in Section III. Simulation results are presented in Section IV.
In practice, the data set of each device is limited, and it is difficult for a device to obtain a good model by training only on its own data without using other devices' data sets. However, directly using the data sets of other devices places a severe burden on transmission and computation and raises a serious risk of information leakage. To address these issues, the distributed knowledge graph framework in Fig. 1 is applied, in which each device transmits only its trained model parameters to the server E, which considerably reduces the transmission energy consumption.

Big data and knowledge graph based fault diagnosis framework
First, the central server E randomly selects M devices out of the I devices in each round, for efficiency and fairness [22]. Then, the central server broadcasts the whole (global) model (WM) to all selected devices. After that, each selected device updates its local model (LM) based on the global model. These M devices further upload their LMs to the central server E, with the aim of improving both the LMs and the WM. This procedure is repeated for R rounds, or until a sufficiently accurate deep network model is obtained.
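The round structure described above can be illustrated with a minimal Python sketch. The scalar least-squares model, the function names `local_update` and `run_rounds`, the learning rate and the toy data are illustrative assumptions and are not part of the proposed framework; only the select, broadcast, update and aggregate steps follow the text.

```python
import random


def local_update(w, data, lr=0.2, epochs=5):
    """Refine the received model w on one device's local data set."""
    for _ in range(epochs):
        # Gradient of the mean squared error of the toy model y = w * x.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def run_rounds(datasets, num_selected, rounds, seed=0):
    """Select devices, broadcast, update locally and aggregate, for `rounds` rounds."""
    rng = random.Random(seed)
    w_global = 0.0
    for _ in range(rounds):
        # The server picks M devices uniformly at random out of the I devices.
        selected = rng.sample(range(len(datasets)), num_selected)
        # Each selected device starts from the broadcast global model.
        local_models = [local_update(w_global, datasets[i]) for i in selected]
        sizes = [len(datasets[i]) for i in selected]
        # The server averages the uploads, weighted by local data set size.
        w_global = sum(n * w for n, w in zip(sizes, local_models)) / sum(sizes)
    return w_global


# Toy data (assumed): 20 devices, each holding 30 noisy samples of y = 3x.
data_rng = random.Random(1)
datasets = []
for _ in range(20):
    xs = [data_rng.uniform(-1, 1) for _ in range(30)]
    datasets.append([(x, 3.0 * x + data_rng.gauss(0, 0.01)) for x in xs])

w_final = run_rounds(datasets, num_selected=5, rounds=30)
print(f"global model after 30 rounds: {w_final:.3f}")
```

In this sketch no device ever sends its raw samples; only the scalar model travels between the devices and the server, which mirrors the motivation for the framework.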
Specifically, at round $r$, each selected device receives the global model of the previous round, $v^{r-1}$. In general, the downlink data rate is much higher than the uplink data rate, and the server has a larger transmit power than the devices because it is connected to the power supply. Thus, for device $D_i$, the time needed to receive the global model parameters is negligible. Over $\xi_i$ local epochs, device $D_i$ updates its local model by using the LM and the local data set, and the local model is updated as
$$v_i^{t} = v_i^{t-1} - \varrho \nabla A_i\left(v_i^{t-1}\right),$$
where $v_i^{t}$ represents the local model of device $D_i$ and $A_i(\cdot)$ denotes the loss function of device $D_i$. The notations $\varrho$ and $\nabla$ denote the learning rate and the gradient operation on the loss function, respectively. The local training time of device $D_i$ is $t_i^{e}$, given by
$$t_i^{e} = \frac{\upsilon\, \xi_i\, |B_i|}{a_i},$$
where $\upsilon$ is the number of CPU cycles needed to process one data sample and $\xi_i$ is the number of local training epochs of device $D_i$ in a transmission round. Moreover, $a_i \sim U(a_{\min}, a_{\max})$ is the computational capability of device $D_i$, where $U(\cdot)$ denotes a uniform distribution, and $a_{\min}$ and $a_{\max}$ are the minimum and maximum computational capabilities, respectively. After the training, the local model is sent to the central server, and the transmission time of device $D_i$ is given by $K_i / Q_i$, where $K_i$ is the size of the model parameters of device $D_i$, for which the 32-bit floating-point format is used in practice, and $Q_i$ is the transmission rate between device $D_i$ and the server E, given by
$$Q_i = O_i \log_2\left(1 + \frac{P_i |g_i|^2}{\vartheta_i^2}\right),$$
where $P_i$ is the transmit power and $\vartheta_i^2$ is the variance of the additive white Gaussian noise (AWGN) at the receiver. The channel parameter $g_i$ experiences Rayleigh flat fading with an average channel gain of $\varsigma_i$, and $O_i$ is the wireless bandwidth allocated to device $D_i$. In practice, the frequency spectrum is limited, and the bandwidth of the $I$ devices should satisfy the following constraint,
$$\sum_{i \in \mathcal{I}} O_i \le O_{total},$$
where $O_{total}$ represents the total wireless bandwidth.
The server E further aggregates the received model parameters and obtains the global model. We use federated averaging (FedAvg), and the global model parameter $v^{r}$ at round $r$ can be updated as
$$v^{r} = \frac{\sum_{i=1}^{M} |B_i|\, v_i^{r}}{\sum_{i=1}^{M} |B_i|},$$
where the sums run over the $M$ selected devices. From the above equations, we can conclude that the total delay of device $D_i$ in each round is
$$t_i = t_i^{e} + \frac{K_i}{Q_i}.$$
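To make the delay model concrete, the following Python sketch evaluates the per-round delay of a single device. It assumes that the training time is the total number of CPU cycles divided by the CPU rate and that the uplink follows the Shannon rate; the function names and all numerical values except the 0.6 W transmit power are hypothetical.

```python
import math


def training_time(cycles_per_sample, epochs, num_samples, cpu_rate):
    """Local training time in seconds: total CPU cycles over CPU rate."""
    return cycles_per_sample * epochs * num_samples / cpu_rate


def uplink_rate(bandwidth_hz, tx_power, channel_gain, noise_var):
    """Shannon rate in bit/s; channel_gain stands for |g_i|^2."""
    return bandwidth_hz * math.log2(1 + tx_power * channel_gain / noise_var)


def total_delay(num_params, rate, t_train, bits_per_param=32):
    """Per-round delay: local training plus upload; the download is neglected."""
    return t_train + num_params * bits_per_param / rate


# Hypothetical device: 500 samples, 2 local epochs, 1000 cycles per sample.
t_train = training_time(cycles_per_sample=1000, epochs=2, num_samples=500, cpu_rate=2.5e6)
rate = uplink_rate(bandwidth_hz=1e6, tx_power=0.6, channel_gain=0.5, noise_var=0.1)
delay = total_delay(num_params=100_000, rate=rate, t_train=t_train)
print(f"training {t_train:.2f} s, rate {rate / 1e6:.2f} Mbit/s, total delay {delay:.2f} s")
```

With these assumed numbers the upload dominates the delay, which is why the allocation of bandwidth among devices matters in the later sections.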

Optimization of distributed knowledge graph framework
To accelerate the global aggregation, a latency threshold $\beta_{th}$ should be set. Specifically, device $D_i$ is able to complete the upload only if its delay is below $\beta_{th}$, i.e.,
$$t_i \le \beta_{th}.$$
This condition indicates that device $D_i$ is dropped if its latency is above $\beta_{th}$. In particular, a smaller $\beta_{th}$ increases the probability that device $D_i$ is dropped from the global training. By incorporating this latency constraint, we can optimize the distributed knowledge graph framework by allocating the limited bandwidth among the $M$ devices so as to minimize the global loss function, subject to $t_i \le \beta_{th}$, $\forall i \in \mathcal{I}$, and $\sum_{i \in \mathcal{I}} O_i \le O_{total}$. Because this problem is not trivial to solve, the objective is replaced in (10) by maximizing $I_1$, the number of active devices that can successfully upload their models, under the same two constraints. In the following, we use two FA schemes to solve the optimization problem in (10).
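The role of the latency threshold can be sketched in a few lines of Python: a device counts as active only if its training time plus upload time stays within the threshold. The constants below, apart from the 0.6 W transmit power, and the helper names `device_delay` and `active_devices` are assumptions made for illustration, and the equal bandwidth split is just one possible allocation.

```python
import math

MODEL_BITS = 3.2e6     # hypothetical model size K_i in bits
TX_POWER = 0.6         # transmit power P_i in W
NOISE_VAR = 0.1        # hypothetical noise variance
TRAIN_CYCLES = 1.0e6   # hypothetical CPU cycles of local training per round


def device_delay(bandwidth_hz, channel_gain, cpu_rate):
    """Training time plus upload time of one device, in seconds."""
    if bandwidth_hz <= 0 or channel_gain <= 0:
        return math.inf  # no usable link, so the upload never finishes
    rate = bandwidth_hz * math.log2(1 + TX_POWER * channel_gain / NOISE_VAR)
    return TRAIN_CYCLES / cpu_rate + MODEL_BITS / rate


def active_devices(bandwidths, gains, cpu_rates, beta_th):
    """Indices of the devices whose delay does not exceed the threshold."""
    return [i for i, (o, g, a) in enumerate(zip(bandwidths, gains, cpu_rates))
            if device_delay(o, g, a) <= beta_th]


# Three devices sharing 3 MHz equally; device 1 has a weak channel.
bandwidths = [1.0e6, 1.0e6, 1.0e6]
gains = [0.5, 0.05, 0.5]
cpu_rates = [2.5e6, 2.5e6, 2.5e6]
for beta_th in (1.0, 2.5, 10.0):
    print(beta_th, active_devices(bandwidths, gains, cpu_rates, beta_th))
```

Tightening the threshold shrinks the active set, and the weak-channel device is the first to be dropped, which is the effect the two FA schemes try to counteract.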

Analysis of DSI
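Since the device state information significantly affects the performance of the distributed knowledge graph framework, it is of vital importance to analyze the DSI associated with the devices, which can be further used for device fault detection. We first analyze the expectation, which reflects the long-term average value for the device. The expectation of $Q_i$ is given by
$$E(Q_i) = \int_0^{\infty} O_i \log_2\left(1 + \frac{P_i x}{\vartheta_i^2}\right) f_{|g_i|^2}(x)\, dx.$$
Note that the PDF of $|g_i|^2$ is given by
$$f_{|g_i|^2}(x) = \frac{1}{\varsigma_i} e^{-x/\varsigma_i}.$$
By substituting $f_{|g_i|^2}(x)$ into the expectation, we can rewrite the expectation of $Q_i$ as
$$E(Q_i) = \frac{O_i}{\varsigma_i} \int_0^{\infty} \log_2\left(1 + \frac{P_i x}{\vartheta_i^2}\right) e^{-x/\varsigma_i}\, dx.$$
By using integration by parts, we can further obtain the analytical expectation of $Q_i$ as
$$E(Q_i) = -\frac{O_i}{\ln 2} \exp\left(\frac{\vartheta_i^2}{P_i \varsigma_i}\right) \mathrm{Ei}\left(-\frac{\vartheta_i^2}{P_i \varsigma_i}\right),$$
where $\mathrm{Ei}(\cdot)$ denotes the exponential integral function [24]. Besides the above expectation analysis, we can also analyze the device outage performance for a given rate $Q_{th}$. The outage probability of device $D_i$, $P_{out,i}$, is given by
$$P_{out,i} = \Pr\left(Q_i < Q_{th}\right) = \Pr\left(|g_i|^2 < \gamma_{th}\right),$$
where the outage threshold $\gamma_{th}$ is given by
$$\gamma_{th} = \frac{\vartheta_i^2}{P_i}\left(2^{Q_{th}/O_i} - 1\right).$$
By applying the PDF of $|g_i|^2$, we can obtain the analytical outage probability of device $D_i$ in the fault detection as
$$P_{out,i} = 1 - \exp\left(-\frac{\gamma_{th}}{\varsigma_i}\right).$$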
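The two statistics discussed above can be checked numerically. The sketch below is written under the assumption that $|g_i|^2$ is exponentially distributed with mean $\varsigma_i$ and that the rate follows the Shannon formula; it compares closed-form values with a Monte Carlo estimate. The power series used for the exponential integral, the function names and the parameter values other than the 0.6 W transmit power are illustrative choices, not taken from the paper.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def exp_integral_e1(x, terms=60):
    """E1(x) = -Ei(-x) for x > 0, via its power series (suited to small x)."""
    series = 0.0
    term = 1.0
    for k in range(1, terms + 1):
        term *= -x / k          # term now equals (-x)^k / k!
        series += term / k
    return -EULER_GAMMA - math.log(x) - series


def mean_rate(bandwidth_hz, tx_power, noise_var, mean_gain):
    """Expected Shannon rate over an exponentially distributed channel gain."""
    x = noise_var / (tx_power * mean_gain)
    return bandwidth_hz / math.log(2) * math.exp(x) * exp_integral_e1(x)


def outage_probability(rate_th, bandwidth_hz, tx_power, noise_var, mean_gain):
    """Probability that the instantaneous rate falls below rate_th."""
    gamma_th = noise_var / tx_power * (2 ** (rate_th / bandwidth_hz) - 1)
    return 1 - math.exp(-gamma_th / mean_gain)


# Monte Carlo check with assumed parameters.
rng = random.Random(0)
O, P, NOISE, MEAN_GAIN, Q_TH = 1e6, 0.6, 0.1, 0.5, 1e6
rates = [O * math.log2(1 + P * rng.expovariate(1 / MEAN_GAIN) / NOISE)
         for _ in range(200_000)]
mc_rate = sum(rates) / len(rates)
mc_outage = sum(r < Q_TH for r in rates) / len(rates)
closed_rate = mean_rate(O, P, NOISE, MEAN_GAIN)
closed_outage = outage_probability(Q_TH, O, P, NOISE, MEAN_GAIN)
print(f"mean rate: closed form {closed_rate:.0f}, simulated {mc_rate:.0f} bit/s")
print(f"outage:    closed form {closed_outage:.4f}, simulated {mc_outage:.4f}")
```

The series for the exponential integral converges quickly only for small arguments, so a library routine would be preferable when the ratio of noise variance to average received power is large.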

I-DSI Based FA scheme
Since the I-DSI of the M devices is estimated and obtained at each round of federated learning, we can solve the FA optimization problem in (10) according to the instantaneous DSI.
Note that the I-DSI based FA scheme can be applied to wireless networks with slowly varying fading, such as the application scenarios of fixed Internet of Things (IoT) systems. To train the global model better, more devices should participate in uploading models in each transmission round. Motivated by this, we propose a sorting-based FA scheme, in which the devices with favorable channel conditions tend to be allocated a suitable bandwidth preferentially. Specifically, we first sort the $I$ devices in descending order of channel condition, which forms an ordered set. In this set, the former devices have better channel conditions than the latter ones. Then, the devices in the set are selected one by one from the first to the last, and each is allocated a suitable bandwidth so as to satisfy the latency requirement. Specifically, for device $D_i$, its bandwidth allocation should satisfy the following requirement,
$$t_i^{e} + \frac{K_i}{O_i \log_2\left(1 + \frac{P_i |g_i|^2}{\vartheta_i^2}\right)} \le \beta_{th},$$
which results in
$$O_i = \frac{K_i}{\left(\beta_{th} - t_i^{e}\right) \log_2\left(1 + \frac{P_i |g_i|^2}{\vartheta_i^2}\right)},$$
if there is sufficient bandwidth left. This procedure continues until all devices in the set have been allocated bandwidth or the bandwidth resource is used up.
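The sorting-based allocation above can be sketched as a greedy loop in Python. The sketch assumes that each device receives exactly the smallest bandwidth that meets the deadline and that a device whose requirement exceeds the remaining bandwidth is skipped rather than ending the procedure; both choices, the helper names and the default parameter values (except the 0.6 W transmit power) are assumptions made for illustration.

```python
import math


def required_bandwidth(channel_gain, cpu_rate, beta_th, *, model_bits=3.2e6,
                       tx_power=0.6, noise_var=0.1, train_cycles=1.0e6):
    """Smallest bandwidth (Hz) that lets a device meet the deadline beta_th."""
    slack = beta_th - train_cycles / cpu_rate   # time left for the upload
    if slack <= 0:
        return math.inf                         # training alone misses the deadline
    spectral_eff = math.log2(1 + tx_power * channel_gain / noise_var)
    if spectral_eff <= 0:
        return math.inf                         # no usable channel
    return model_bits / (slack * spectral_eff)


def allocate_idsi(channel_gains, cpu_rates, beta_th, total_bandwidth):
    """Serve devices in descending order of channel gain while bandwidth remains."""
    order = sorted(range(len(channel_gains)),
                   key=lambda i: channel_gains[i], reverse=True)
    allocation = [0.0] * len(channel_gains)
    remaining = total_bandwidth
    for i in order:
        need = required_bandwidth(channel_gains[i], cpu_rates[i], beta_th)
        if need <= remaining:
            allocation[i] = need
            remaining -= need
    return allocation


gains = [0.5, 0.05, 0.2]
allocation = allocate_idsi(gains, cpu_rates=[2.5e6] * 3, beta_th=2.0,
                           total_bandwidth=3.0e6)
active = [i for i, bw in enumerate(allocation) if bw > 0]
print("allocation (MHz):", [round(bw / 1e6, 3) for bw in allocation])
print("active devices:", active)
```

Because a weak channel needs far more bandwidth to meet the same deadline, serving the strongest channels first tends to admit more devices for a fixed total bandwidth.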

S-DSI Based FA management
Since the above I-DSI based FA scheme needs to know the I-DSI of all devices in every round, a heavy burden is imposed on the system implementation. To alleviate this burden, we turn to the statistical DSI (S-DSI) to solve the FA problem in (10). Note that the S-DSI based scheme can be applied to wireless networks such as the application scenarios of Internet of Vehicles systems. From the latency requirement $t_i \le \beta_{th}$, we first write the channel condition under the given delay threshold $\beta_{th}$ as
$$|g_i|^2 \ge H(a_i),$$
where $H(y)$ is
$$H(y) = \frac{\vartheta_i^2}{P_i}\left(2^{\frac{K_i}{O_i\left(\beta_{th} - \upsilon \xi_i |B_i| / y\right)}} - 1\right).$$
When the channels in the network experience Rayleigh fading, $|g_i|^2$ follows the exponential distribution with an average gain of $\varsigma_i$. In this case, we turn to maximizing the expected number of active devices participating in uploading models to the server. Motivated by this, we first compute the probability that each device can successfully upload its model to the server while satisfying the latency requirement. Let $y_i \in \{0, 1\}$ indicate whether device $D_i$ uploads its model successfully. From the probability density function of $|g_i|^2$, we can obtain the conditional expectation of the $i$-th device participating in uploading its model to the server as
$$E(y_i \mid a_i) = \Pr\left(|g_i|^2 \ge H(a_i)\right) = \exp\left(-\frac{H(a_i)}{\varsigma_i}\right).$$
From $E(y_i \mid a_i)$ and $a_i \sim U(a_{\min}, a_{\max})$, we can write the expectation $E(y_i)$ as
$$E(y_i) = \int_{a_{\min}}^{a_{\max}} \frac{1}{a_{\max} - a_{\min}} \exp\left(-\frac{H(y)}{\varsigma_i}\right) dy \approx \frac{\pi}{2l} \sum_{k=1}^{l} \sqrt{1 - \theta_k^2}\, \exp\left(-\frac{1}{\varsigma_i} H\left(\frac{a_{\max} - a_{\min}}{2}\theta_k + \frac{a_{\max} + a_{\min}}{2}\right)\right),$$
where the Gaussian-Chebyshev (GC) approximation [24] is used and $l$ is a complexity-versus-accuracy trade-off parameter with
$$\theta_k = \cos\left(\frac{(2k - 1)\pi}{2l}\right).$$
Note that the GC approximation can be accurate with a moderate value of $l$. From $E(y_i)$, we can compute the expected number of active devices that can successfully upload model parameters, given by
$$E(Y) = \sum_{i=1}^{M} E(y_i).$$
The FA scheme is then designed to maximize the expectation $E(Y)$,
$$\max_{\{O_1, \ldots, O_M\}} E(Y) \quad \text{subject to} \quad \sum_{i=1}^{M} O_i \le O_{total}.$$
Since it is difficult to solve this optimization problem directly, we turn to the PSO algorithm, which is a population-based intelligent algorithm. In the PSO algorithm, there are $N$ particles in the population, and each particle has two key attributes, position and velocity. We use $p_n^{j}$ and $w_n^{j}$ to denote the position and velocity of particle $n$ at iteration $j$, respectively, where $p_n^{j} = \{O_1, O_2, \ldots, O_M\}$ gives a feasible solution of the bandwidth allocation problem above and $w_n^{j} = \{\Delta O_1, \Delta O_2, \ldots, \Delta O_M\}$ represents the increment of $p_n^{j}$. Here, $\Delta O_i$ is the increment of $O_i$ from the current iteration to the next one. Moreover, $pb_n$ and $gb$ denote the best FA solutions of particle $n$ and of all particles up to the current iteration, respectively, which are determined by the fitness function. Here, the fitness function of the PSO is given by $E(Y)$. At iteration $j$, the velocity of the $n$-th particle is updated by
$$w_n^{j+1} = \pi w_n^{j} + q_1 \phi_1 \left(pb_n - p_n^{j}\right) + q_2 \phi_2 \left(gb - p_n^{j}\right),$$
where $q_1$ and $q_2$ are two acceleration constants, $\phi_1$ and $\phi_2$ are two random variables uniformly distributed in the range $[0, 1]$, and $\pi$ denotes the inertia weight factor. From the updated velocity, the position is updated by
$$p_n^{j+1} = p_n^{j} + w_n^{j+1}.$$
The particles need $J$ iterations to update their velocity and position according to the two update rules above. After $J$ iterations, $gb$ is obtained among the $N$ particles. Furthermore, for $N$ particles and $J$ iterations in the PSO algorithm, the associated computational complexity is about $O(N \times J)$, and the performance of the PSO can be improved with larger numbers of particles and iterations.
EAI Endorsed Transactions on Industrial Networks and Intelligent Systems, Online First, Yuzhong Zhou et al.
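A compact Python sketch of the S-DSI based scheme is given below. It estimates each device's success probability with a Gaussian-Chebyshev sum over the uniformly distributed CPU rate and then searches the bandwidth split with PSO, using the 30 particles, 20 iterations, acceleration constants of 0.4 and inertia of 0.5 reported in the simulation settings. The projection step that keeps every particle on the bandwidth budget, the seeding of one particle with the equal split, the function names and the model parameters in the example are assumptions of this sketch and are not taken from the paper.

```python
import math
import random


def success_prob(bw, mean_gain, *, beta_th, a_min, a_max, cycles,
                 model_bits, tx_power, noise_var, terms=20):
    """Gaussian-Chebyshev estimate of the probability of meeting the deadline."""
    if bw <= 0:
        return 0.0
    total = 0.0
    for k in range(1, terms + 1):
        theta = math.cos((2 * k - 1) * math.pi / (2 * terms))
        a = 0.5 * (a_max - a_min) * theta + 0.5 * (a_max + a_min)
        slack = beta_th - cycles / a            # time left for the upload
        if slack <= 0:
            continue
        exponent = model_bits / (bw * slack)
        if exponent > 500:                      # probability is numerically zero
            continue
        h = noise_var / tx_power * (2.0 ** exponent - 1.0)
        total += math.sqrt(1 - theta * theta) * math.exp(-h / mean_gain)
    return math.pi / (2 * terms) * total


def project(position, total_bw):
    """Clip to positive values and rescale so the bandwidths sum to total_bw."""
    clipped = [max(x, 1e-9) for x in position]
    scale = total_bw / sum(clipped)
    return [x * scale for x in clipped]


def pso_allocate(mean_gains, total_bw, *, particles=30, iterations=20,
                 inertia=0.5, q1=0.4, q2=0.4, seed=0, **model):
    """Search for the bandwidth split that maximizes the expected active count."""
    rng = random.Random(seed)
    n = len(mean_gains)

    def fitness(pos):
        return sum(success_prob(o, g, **model) for o, g in zip(pos, mean_gains))

    pos = [project([rng.random() for _ in range(n)], total_bw)
           for _ in range(particles)]
    pos[0] = [total_bw / n] * n                 # one particle starts at the equal split
    vel = [[0.0] * n for _ in range(particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    top = max(range(particles), key=lambda j: pbest_fit[j])
    gbest, gbest_fit = pbest[top][:], pbest_fit[top]
    for _ in range(iterations):
        for j in range(particles):
            for d in range(n):
                vel[j][d] = (inertia * vel[j][d]
                             + q1 * rng.random() * (pbest[j][d] - pos[j][d])
                             + q2 * rng.random() * (gbest[d] - pos[j][d]))
            pos[j] = project([p + v for p, v in zip(pos[j], vel[j])], total_bw)
            fit = fitness(pos[j])
            if fit > pbest_fit[j]:
                pbest[j], pbest_fit[j] = pos[j][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[j][:], fit
    return gbest, gbest_fit


model = dict(beta_th=2.0, a_min=2.1e6, a_max=3.1e6, cycles=1.0e6,
             model_bits=3.2e6, tx_power=0.6, noise_var=0.1)
mean_gains = [(i + 50) / 200 for i in range(1, 11)]
total_bw = 10e6
best_alloc, best_value = pso_allocate(mean_gains, total_bw, **model)
uniform_value = sum(success_prob(total_bw / len(mean_gains), g, **model)
                    for g in mean_gains)
print(f"expected active devices: equal split {uniform_value:.3f}, PSO {best_value:.3f}")
```

Rescaling every particle to the full budget relies on the success probability being non-decreasing in the allocated bandwidth, so spending all of the available bandwidth is never worse than leaving some unused.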

Fault Diagnosis Results and Discussions
In this section, we perform simulations to validate the proposed big data and knowledge graph based fault diagnosis in the distributed knowledge graph framework, where Python 3.6 is used and the learning framework is PyTorch 1.8.0. Specifically, the total number of communication rounds is set to 500, the total number of devices is 200, and the number of selected devices is set to 10. Unless otherwise specified, the total communication bandwidth is set to 50 MHz, and the channels in the network experience Rayleigh flat fading, where the average channel gain from the $i$-th device to the server is set to $\varsigma_i = (i + 50)/200$, without loss of generality. Moreover, the transmit power of the devices is set to 0.6 W, and the computational capability of the devices is uniformly distributed in the range $[2.1 \times 10^6, 3.1 \times 10^6]$ cycles/second. For the PSO algorithm involved in the S-DSI based FA scheme, the number of particles in the population is 30 and the number of iterations is 20. Both acceleration constants $q_1$ and $q_2$ are set to 0.4, and $\pi = 0.5$. In addition, two common data sets, MNIST and Fashion-MNIST (FMN), are used to train the models and verify the studies.
Tables I-IV show the fault detection accuracy of the FA schemes versus the communication round, where Table I and Table II correspond to FMN with $\beta_{th} = 5.1$ s and MNIST with $\beta_{th} = 3.1$ s, respectively. We can observe from these results that the accuracy and loss of the FA schemes converge as the number of communication rounds increases. In addition, the I-DSI based FA scheme outperforms the S-DSI based one, and it achieves almost the same fault detection accuracy as the standard scheme, because the instantaneous DSI is effectively exploited to optimize the FA process.
Table V shows the impact of the delay threshold $\beta_{th}$ on the fault detection accuracy of the FA schemes, where the FMN data set is used and $\beta_{th}$ varies from 2 s to 10 s. From this table, we can find that the performance of the two FA schemes and of UA improves with a larger $\beta_{th}$, as the devices can upload their models to the server more easily. Moreover, the I-DSI based FA scheme achieves almost the same fault detection accuracy as the standard scheme, even though the probability that devices can successfully upload their models is severely reduced in the low region of $\beta_{th}$. These phenomena validate the two FA schemes.
Table VI depicts the effect of the total communication bandwidth on the test accuracy of the FA schemes, where the FMN data set is used, $\beta_{th}$ is 5 s, and the bandwidth $O_{total}$ varies. The test accuracy of the schemes improves with an increasing value of $O_{total}$, as the transmission rate becomes larger and the transmission latency decreases accordingly. Moreover, the I-DSI based FA scheme achieves almost the same fault detection accuracy as the standard scheme. Besides, compared with the I-DSI based FA scheme, the S-DSI based one deteriorates more rapidly as the bandwidth shrinks, because it becomes harder for the devices to complete the model upload owing to the reduced transmission data rate in the S-DSI based FA scheme. These results further verify the fault detection accuracy of the two FA schemes.
Tables VII-VIII show the effect of the latency threshold and the total communication bandwidth on the fault detection accuracy of the FA schemes, where the MNIST data set is used. Specifically, Table VII gives the fault detection accuracy versus the delay threshold with $O_{total} = 55$ MHz, while Table VIII gives the fault detection accuracy versus the communication bandwidth with $\beta_{th} = 3$ s. From these two tables, we can find that the fault detection accuracy of both schemes is better than that of the UA one, and the improvement grows with a larger $O_{total}$ or $\beta_{th}$, as the devices have more chances to finish the local training and the model upload. Overall, the results in Tables VII-VIII demonstrate the fault detection accuracy of the proposed distributed knowledge graph framework with the proposed FA schemes.

Conclusions
This paper analyzed a distributed knowledge graph framework for fault detection in electric power systems, in which multiple devices trained local detection models with the assistance of a central server. Each device owned a local data set composed of historical fault information and the current device state, which could be used to train a local model for fault detection. To enhance the detection performance, the distributed devices interacted with each other in the KG framework, where each device had to complete its local computation and the model aggregation within a specified latency threshold. In particular, two FA schemes were developed for the distributed knowledge graph framework: scheme I was based on the instantaneous device state information (DSI), and scheme II used the particle swarm optimization (PSO) technique together with the statistical DSI. Simulation results on test accuracy and convergence were finally presented to show the advantages of the proposed distributed KG framework in fault detection for electric power systems.

Fig. 1
Fig. 1 shows the big data and knowledge graph based fault diagnosis for electric power systems in the distributed knowledge graph framework, where $I$ devices $\{D_i \mid 1 \le i \le I\}$ attempt to accelerate the training of their models for local fault detection with the assistance of a centralized server E. Each device $D_i$ owns a data set $B_i$ composed of historical fault information and the current device state. This data set can be used to train a local model for fault detection, and the number of samples in $B_i$ is $|B_i|$. The total number of samples of the fault detection data set is $|B| = \sum_{i=1}^{I} |B_i|$.
Problem (9) is subject to $t_i \le \beta_{th}$, $\forall i \in \mathcal{I}$, and $\sum_{i \in \mathcal{I}} O_i \le O_{total}$, where $\mathcal{I}$ is the set of the $I$ devices and $A(\cdot)$ represents the global loss function. Since tasks associated with different data sets may use different loss functions, the optimization problem in (9) is not trivial. Motivated by the fact that successfully uploading more models in each round helps improve the performance of the distributed knowledge graph framework [23], we turn to optimizing the framework by maximizing the number of active devices that can successfully upload their models, given by
$$\max_{\{O_1, \ldots, O_M\}} I_1. \quad (10)$$
Big data and knowledge graph based fault diagnosis framework for the electric power systems.

Table 6.
Effect of total communication bandwidth on the FA schemes with Fashion-MNIST.

Table 7.
Effect of latency threshold and total communication bandwidth on the FA schemes with MNIST.

Table 8.
Effect of latency threshold and total communication bandwidth on the FA schemes with MNIST.