Toward Developing Fog Decision Making on the Transmission Rate of Various IoT Devices Based on Reinforcement Learning

Motahareh Mobasheri, Yangwoo Kim, and Woongsup Kim

Abstract: In recent years, the focus on reducing the delay and the cost of transferring data to the cloud has led to data processing near end devices. Therefore, fog computing has emerged as a powerful complement to the cloud to handle the large data volume belonging to the Internet of Things (IoT) and the requirements of communications. Over time, because of the increasing number of IoT devices, managing them by a fog node has become more complicated. The problem addressed in this study is the transmission rate of various IoT devices to a fog node in order to prevent delays in emergency cases. We formulate the decision making problem of a fog node using a reinforcement learning approach, taking a smart city as an example of a smart environment, and then develop a Q-learning algorithm to achieve efficient decisions on IoT transmission rates to the fog node. To the best of our knowledge, there has thus far been no research with this objective; in this study, two further approaches, random-based and greedy-based, are simulated to show that our method performs considerably better (over 99.8 percent performance) than these algorithms.

Digital Object Identifier: 10.1109/IOTM.0001.1900070
During the last few years, with the emergence of the Internet of Things (IoT) and its consequent trends such as smart cities, cloud-based infrastructures have become inefficient solutions because of their centralized computing model. Given the continuously increasing number of IoT devices, besides the task of managing them, cloud limitations such as latency and network bandwidth require more attention, especially in emergencies such as car crashes in a smart city, in which every millisecond is important for preventing damage.
Moreover, because of bandwidth limitations, it is impossible to
transfer all of the IoT devices’ data to the cloud. Every moment,
IoT devices produce a considerable amount of data that is not
economically feasible to transfer to the cloud. These problems result in information loss and subsequent incorrect decision making.
These challenges have led to the need for a distributed
intelligent platform at the edge of the network. Fog computing extends the cloud services to the network edge, bringing
computation, communication, and storage closer to end users
with certain latency, mobility, bandwidth, security, and privacy
constraints. By using analytical algorithms in the fog layer, we
can preprocess IoT data and only send higher-level events to
the center (the cloud). Besides the above mentioned problems,
sometimes, decisions about the IoT devices need to be made in
real time. By using the intelligence in the network edge, edge
devices can make decisions more quickly with machine learning techniques.
With the advent of fog computing, IoT management has
become a popular research area. In , a software defined
networking (SDN)-based approach was proposed for managing IoT by changing the existing structure and decoupling the
control plane from the data plane. In , the authors focused
on the relationship between the edge and the cloud, and proposed an approach to manage the requirements of low-latency
and bandwidth-intensive applications. In , the authors solved
the problem of the video traffic volume generated by IoT-based
multimedia devices by transmitting prioritized frames for a
video sequence and prioritized packets for an image, thereby
ensuring that the data transmission took less bandwidth. This
approach needs extra processing in the IoT devices for assigning priorities to data frames. To achieve efficiency in bandwidth
and power management, in , the authors eliminated the
redundant data gathered from different sensors. The goal of
 was to improve the users’ quality of experience (QoE) by
addressing the issue of load balancing in fog computing. In ,
a node called a broker was introduced to perform the scheduling among the users and the fog and cloud nodes. The system described in  dynamically and automatically determines
the processing tasks to be computed on a cloud server or its
defined local gateway device.
Although these papers focused on the communication between end devices and fog nodes, they considered cloud nodes along with fog nodes. Moreover, they did not consider emergency situations, and their approaches were not based on machine learning techniques for the devices' bandwidth allocation. In
this study, we only focus on the fog node side and the corresponding IoT devices.
Since one of the IoT challenges is bandwidth consumption,
bandwidth management has become crucial in smart environments. As the number of devices increases, this challenge
becomes more important. As smart cities grow more advanced and are equipped with an exponentially growing variety of IoT devices with different requirements, managing them is not an easy problem. Moreover, because each smart environment has its own unique situations, it is not practical to define rule tables manually and separately for the IoT devices and fog node operations of each one.
By using an appropriate algorithm, we can reduce human
errors and eliminate time-consuming efforts for defining accurate rules, and set everything automatically and usable for all
smart environments without any assumption on the type of
networks and devices’ features. Reinforcement learning (RL),
as a powerful machine learning approach, does not need any
trainer or supervisor for solving problems and is suitable for the
mentioned purposes. RL is a learning process for mapping visited states to available and feasible actions in order to maximize
a received reward. The Q-learning algorithm is one of the most
common RL algorithms, learning optimal decisions by using
only the rewards received from the environment. An RL task
that satisfies the Markov property is called a Markov decision
process (MDP) . Starting from the current state of the RL
agent, Q-learning finds an optimal policy for any finite MDP,
while maximizing the expected value of the total reward over
all successive steps .
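For reference, the standard one-step Q-learning update rule can be written as

Q(s, a) ← Q(s, a) + α [ r + γ max_{a′} Q(s′, a′) − Q(s, a) ]

where α is the learning rate, γ is the discount factor, r is the received reward, and s′ is the next state; the update used later in this article is an equivalent form of this rule.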
The problem addressed in this study regards managing the
network bandwidth of a smart city in which several IoT devices
are connected to a single fog node. Because of the network’s
bandwidth limitation, if an emergency event happens in one of
the IoT devices’ area, the fog node will have to select an appropriate device and decrease its bandwidth in order to increase
the bandwidth of the needy device that is in an emergency.
The goal of this study is to make the best decisions about device selection for helping emergency devices on the basis of RL with maximum performance. The fog node, as a manager and a learner agent, should be aware of emergency devices
and help them by increasing their bandwidth. Therefore, the
fog node can receive more data, and then it can prepare better
reports for higher levels of the network structure. Since the
total bandwidth of the network is fixed, the fog node should
learn the best IoT device for decreasing its bandwidth and then
increasing the bandwidth of the current needy device.
The remainder of the article is organized as follows. The
research problem is defined in detail. We formulate the decision making problem and present the proposed method. Then
we describe the simulation of the proposed approach, followed
by our conclusions and suggestions for future works.
In this article, we consider the decision making problem of a single fog node, in emergency cases, regarding the transmission rates of various IoT devices in a smart city as a smart environment. This
smart city’s network has a fixed and limited total bandwidth,
supports high video traffic, and includes several IoT devices
with predefined priorities connected to a single fog node to
which they send their data. The amount of the primary bandwidth that they are allowed to use for transferring data to the
fog node, their amount of required additional bandwidth in
sudden and emergency situations, and their priorities on the
basis of their locations are fixed and predefined by the smart
city management system.
We have considered a single type of camera device among the different categories of IoT devices in a smart city; these cameras monitor the smart city's crowded main highways. Although
the camera devices can adjust their quality of capturing video, it
is not necessary to transfer high-quality videos to the fog node
in normal situations. In contrast, in emergencies, the fog node
should prepare the required extra bandwidth for the involved
cameras to receive sufficient data with better quality for further
analyses of events that have caused the emergency situations.
If a large event covers a wide area of the smart city and numerous devices get into emergency situations, the low-priority devices, placed at unimportant locations, are responsible for helping high-priority devices with their own bandwidth in order to prevent the smart city's network failure. We have classified the IoT
devices connected to our scenario’s fog node into three types:
1. IoT devices with the lowest priority. When these devices have sufficient bandwidth, they are always candidates for helping emergency devices. Furthermore, these devices never meet emergency situations, on the basis of their locations in the smart city.
2. Main devices with the highest priority. These devices can never help emergency devices and should always be in active mode.
3. IoT devices whose priorities are neither the lowest nor the highest. When they are not in emergency situations and have sufficient required bandwidth, they can help the emergency devices.
We have considered a feasible device set (helping candidate set) that includes type 1 and type 3 devices that are in normal situations and have enough available bandwidth to lend the current needy device the amount that it needs.
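As a minimal sketch, the feasible device set described above could be computed as follows; the Device fields and names here (dev_type, priority, bandwidth, min_bandwidth, emergency) are our own illustrative assumptions, not identifiers from the article.

from dataclasses import dataclass

@dataclass
class Device:
    idx: int              # device index in the flag array
    dev_type: int         # 1 = lowest priority, 2 = highest priority, 3 = in between
    priority: int         # higher value = more important location
    bandwidth: float      # currently assigned bandwidth
    min_bandwidth: float  # bandwidth the device must keep for itself
    emergency: bool       # current flag value of this device

def feasible_set(devices, needed):
    # Type 1 and type 3 devices in a normal situation with enough spare
    # bandwidth to lend `needed` to the current needy device.
    return [d for d in devices
            if d.dev_type != 2
            and not d.emergency
            and d.bandwidth - d.min_bandwidth >= needed]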
In the proposed method, in order to achieve the above
mentioned objective, over time, the fog node acts as an RL
agent that has to learn the environment states based on the RL
approach . Then, on the basis of this learning process, the
fog node has to make future-oriented decisions. The fog node
has to select a device from the recently updated feasible device
set and decrease its bandwidth in order to increase the needy
device’s bandwidth for further communications. The goal of the
fog node is to select the best helper from the feasible device
set on the basis of the devices’ priorities. Selecting based on
priority means that whereas there is a helper with lower priority,
choosing a helper with higher priority is not efficient since the
higher assigned priority shows the importance of the device’s
location and higher probability of meeting an emergency situation.
Our fog node’s selection is future oriented; that is, the fog
node’s selections are such that in the future, the system will
meet the minimum emergency situations or the minimum number of devices with insufficient bandwidth due to helping others. The fog node makes these decisions while it does not have
any supervisor and information about the distribution parameters related to future emergencies.
For better intuition, imagine a situation in which the analysis
of the captured data belonging to a main highway shows an
unusual situation such as an accident. In this case, the fog node
needs more data with better quality for further decisions, so it
sends a request to the corresponding camera for increasing its
frequency of transferring data. However, as transferring more
data needs more bandwidth, the fog node selects an appropriate helper and reduces its bandwidth by the amount that
the emergency device needs. As the number of needy devices
increases, the assignment of extra bandwidth to several devices
becomes complicated and has a significant effect on the situation, where every millisecond is important for preventing damage and failure.
Suppose that one or more main highways in a smart city
have suffered from a series of accidents because of heavy snow.
What will happen when several critical sensors get into emergency situations? If we use certain rules defined by a supervisor,
an unpredicted event may take place when there is no predicted instruction for it. Therefore, the best way is that the fog
node learns different situations and their best suitable actions.
In this approach, the fog node tries all the feasible actions one
by one in each visit of every state, learns the best actions during
its learning process, and then uses its best experiences for the
succeeding similar visits in the future without any supervisor.
Note that we have only focused on the bandwidth limitation and not on solving emergency events. The reason for asking an emergency device for more data is that the fog node
is responsible for sending reports with sufficient information
to the center for higher-level decisions. The question of what
actions should be performed to turn the smart city to a normal
situation is related to higher levels of the hierarchical management system, and we do not discuss it here.
In this study, to model the fog node’s decision making problem,
the state, the action, and the reward functions are defined as follows:
• State: The state at time step n describes the situations of all
the connected devices to the fog node. flag is a Boolean
array containing nd elements for nd devices, where each one
shows the situation of a device according to its index. Every
IoT device experiences a normal situation (the 0 position) or
an emergency situation (the 1 position).
We have assumed that all the devices are adjusted to send reports to the fog node at predefined intervals, but as soon as a device detects an unusual situation while processing its captured data, it has to inform the fog node. In every time
step, flag is updated on the basis of the received emergency
reports. The elements of flag that are related to the emergency devices via their indexes are changed to 1, and the others
remain in the 0 position. As soon as the emergency situation
ends, the related device has to inform the fog node to change
the related element of flag to 0.
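A minimal sketch of this state representation, assuming a simple report callback (the interface is our own illustrative choice):

nd = 20            # number of connected IoT devices
flag = [0] * nd    # 0 = normal situation, 1 = emergency situation

def on_report(device_index, in_emergency):
    # Devices report emergencies as they detect them; the fog node mirrors
    # each report into the corresponding element of flag.
    flag[device_index] = 1 if in_emergency else 0

on_report(3, True)   # device 3 enters an emergency: flag[3] becomes 1
on_report(3, False)  # device 3 returns to normal: flag[3] becomes 0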
• Action: If all the devices are in normal situations, there is no
need to select any action. When an emergency occurs, it is
necessary to perform an appropriate action for preparing
enough information for the center in order to prevent further
damage. At the beginning of each time step, the fog node
checks the updated flag to find which elements have recently changed to 1. If there is a new 1-valued element, the
fog node starts making decisions.
For example, assume that an event occurs in the area of
device i (di). Therefore, the fog node has to select a device (dk) from the feasible device set and reassign bandwidth from dk to di, as much as di needs. Finally, when di comes back to its normal situation, the fog node sets the related element of flag from 1 to 0 and then takes back the additionally assigned bandwidth from di and returns it to dk.
Based on the RL approach, with probability ε the fog node selects an action (a helper) randomly; otherwise, it selects on the basis of past experiences, that is, it selects the helper with the highest value in the Q matrix. At the beginning of the learning process, the probability of random action selection is higher than that of selecting an action on the basis of past experiences. As the fog node's experiences strengthen over time, the probability of random action selection decreases step by step.
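A sketch of this ε-greedy selection, reading the text as "explore with probability ε"; the Q matrix indexed by (needy device, helper) is our assumption:

import random

def select_helper(i, feasible, Q, epsilon):
    # `feasible` is assumed non-empty here (the no-helper case is handled
    # separately via the penalty term).
    if random.random() < epsilon:
        # Explore: pick a random feasible helper.
        return random.choice(feasible)
    # Exploit: pick the feasible helper with the highest Q-value for device i.
    return max(feasible, key=lambda j: Q[i][j])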
• Reward function: As the fog node performs its selected
action, it receives a reward value that determines the quality
of the recently selected action. Therefore, when the fog node
assigns sufficient bandwidth to a needy device, it perceives
an increase in the received reward. In this study, the reward
function shows the number of needy devices that received
the required additional bandwidth plus the number of devices with normal situations, since the fog node operation is one
of the reasons for being in a normal situation.
As we have mentioned before, one condition that makes an action selection optimal is the selection of the device with the lowest priority among all the feasible helpers. Therefore, this constraint should affect the reward function. For this purpose, the value of the received reward is decreased by a punishment value equal to the number of helpers in the feasible device set whose priorities are less than the priority of the helper selected for the current needy device. One more parameter, penalty, affects the reward function. Penalty decreases the
reward function only when there is no device with the helping
conditions because of the bad operation of the fog node in the
past. Moreover, the value of penalty should be sufficiently high
to be a good alarm for the fog node. Therefore, we assign the
number of all the devices with lower priorities than the priority
of the needy device to penalty.
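Following the prose above, the reward for one needy device could be sketched as below; treating the terms as simple additive counts is our reading, not an equation given explicitly in the article:

def punish_value(feasible, selected, priority):
    # Number of feasible helpers whose priority is lower than that of the
    # helper actually selected (0 when the lowest-priority helper is chosen).
    return sum(1 for j in feasible if priority[j] < priority[selected])

def reward(num_helped, num_normal, punish, penalty):
    # Helped devices plus devices in the normal situation, decreased by the
    # punishment and, when no feasible helper exists, by the (large) penalty.
    return num_helped + num_normal - punish - penalty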
As soon as the fog node has completed whatever it was supposed to do for each element of flag and received its reward, it updates its Q matrix using the main equation of the Q-learning algorithm (Algorithm 2), where α, γ, and a_i are the learning coefficient, the constant discount factor, and the index of the device selected to help device i, respectively. Moreover, the Q-values are held in the form of two matrices (Q_new and Q_old) because of the updating procedure. The main algorithm (Algorithm 1) and the Q-learning algorithm (Algorithm 2) show the fog node's operation as an RL agent.
Our smart city scenario includes a main highway with 20 IoT
devices (nd = 20) that all send video traffic to the fog node.
The predefined priority array is one of the inputs of the learning algorithm. For 20 devices, we have considered 6, 4, and
10 devices for type 1, 2, and 3, respectively. For simulating
emergency events during the fog node’s learning period, we
use the uniformly distributed pseudorandom integer function in each time step to generate nd random values from {0, 1}, held in the flag array, which shows the situations of all the smart city's devices. When a device has the value 1 in its related index of flag, it has encountered an emergency and needs help. The Q matrix and the flag array are initialized to 0, and ε is set to n^(−0.015) at time step n. The scheduling update interval in the whole implementation is based on time steps, and each time step is considered as 1 s.
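A sketch of this initialization; the article's "uniformly distributed pseudorandom integer function" suggests MATLAB's randi, which random.randint mirrors here, and the Q matrix shape (needy device × helper) is our assumption:

import random

nd = 20                                  # number of IoT devices
Q = [[0.0] * nd for _ in range(nd)]      # Q matrix initialized to 0
flag = [0] * nd                          # all devices start in the normal situation

def epsilon(n):
    # Exploration rate at time step n (n >= 1), following the article's
    # n^(-0.015) schedule: close to 1 early on, slowly decreasing.
    return n ** -0.015

def random_flags():
    # Emergency events simulated as uniform random 0/1 values, one per device.
    return [random.randint(0, 1) for _ in range(nd)]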
We have simulated two more algorithms to compare their
results with our Q-learning approach’s result. One of these
algorithms is called the greedy algorithm, since it selects the
best action in each time step based on the supervisory instructions for the current situation without considering future steps
or past experiences. When the fog node is under the orders
of a supervisor, it does not need to learn various states. The
supervisory instructions are provided in such a way that the fog
node automatically selects a feasible helper with the lowest priority. The other approach, called the random algorithm, selects
a device randomly as a helper and then checks whether it is
possible for this device to help the current needy device or not.
If this device does not satisfy the helping conditions, the fog node continues selecting other devices randomly, regardless of the devices' priorities. As soon as the fog node reaches a
device with the helping conditions, it applies changes in the
bandwidths of the needy device and the helper.
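Minimal sketches of the two baselines as described above; can_help is a placeholder predicate standing in for the helping conditions:

import random

def greedy_helper(feasible, priority):
    # Greedy baseline: always the feasible helper with the lowest priority.
    return min(feasible, key=lambda j: priority[j]) if feasible else None

def random_helper(devices, can_help):
    # Random baseline: draw devices in random order until one satisfies the
    # helping conditions, ignoring priorities entirely.
    candidates = list(devices)
    random.shuffle(candidates)
    for d in candidates:
        if can_help(d):
            return d
    return None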
Successful bandwidth management (SBM) is our target function for evaluating and comparing all three approaches, in which the number of helped devices plus the number of devices in normal situations is decreased by the value of countpun, where countpun calculates the total number of devices in the feasible device set with priorities less than the selected helpers' priorities for all the needy devices in time step n.
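Read literally, the target function at time step n can be written as

SBM_n = (number of helped devices + number of devices in normal situations) − countpun_n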
1. Initialize the priorities, primary bandwidths, extra bandwidths needed in emergencies, and current bandwidths of all IoT devices, as well as flag, the time step, and the Q matrices.
2. While not converged do
   2.1. If all of flag's elements were 0 in the last two steps, go to step 2.4.
   2.2. For each element i of flag do
      2.2.1. If flag(i) has changed from 1 to 0 do
         a. Find the device u that has helped device i.
         b. Take back the borrowed bandwidth from device i and return it to device u.
      2.2.2. Else if flag(i) has changed from 0 to 1 do
         a. Run the Q-learning algorithm (Algorithm 2) for device i.
      2.2.3. End if
   2.3. End for
   2.4. Calculate SBM_n using countpun.
   2.5. Calculate the average SBM.
   2.6. Build the new flag for the next step.
3. End while
Algorithm 1. The main algorithm of the fog node.
1. With probability ε, randomly choose device k among the feasible helpers.
2. Otherwise, a_i = argmax_{j ∈ feasible} Q(i, j), where j is the index of the helper selected for device i.
3. If no device was able to help do
   3.1. Calculate penalty.
4. Else do
   4.1. Calculate punish(i).
   4.2. Increase the bandwidth of device i and decrease the bandwidth of device j by as much as device i needs.
5. End if
6. Calculate R_n^Q based on punish(i), penalty, and the numbers of helped devices and devices in normal situations.
   6.1. countpun = countpun + punish(i)
7. Update the Q matrix:
   7.1. Q_new(i, a_i) = (1 − α) Q_old(i, a_i) + α (R_n^Q + γ max_{k′ ∈ feasible} Q_old(i′, k′))
Algorithm 2. Q-learning.
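As a sketch, the update in step 7 of Algorithm 2 translates to the following; the (needy device, helper) indexing, the handling of an empty feasible set, and the coefficient values in the usage comment are our assumptions:

def q_update(Q_old, i, a_i, r, i_next, feasible_next, alpha, gamma):
    # One Q-learning step: blend the old estimate with the received reward
    # plus the discounted best Q-value reachable from the next state.
    best_next = max((Q_old[i_next][k] for k in feasible_next), default=0.0)
    return (1 - alpha) * Q_old[i][a_i] + alpha * (r + gamma * best_next)

# Usage (illustrative coefficients):
# Q_new[i][a_i] = q_update(Q_old, i, a_i, reward_value, i_next,
#                          feasible_next, alpha=0.1, gamma=0.9)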
Because of the random procedure of the random algorithm,
it has the lowest average SBM. As the number of helpers with
lower priorities increases, the countpun value increases; therefore, the average SBM decreases. Moreover, the selection of
helpers on the basis of supervisory instructions is not optimal,
since it is impossible to consider all the probable events in the fog node's variable environment, and action selections are not
future oriented. By using the Q-value, the agent can choose the
best action on the basis of all the achievable rewards starting
from the current state (not just the immediate received reward).
This is the main motivation for the fog node to try all the feasible helpers in every state and see their results in order to gain
powerful experiences. Therefore, the final average SBM of the
Q-learning algorithm is better than that of the greedy algorithm.
In Fig. 1, all of these algorithms’ results are presented.
The red curve, the blue stars, and the green circles denote
the results of the Q-learning, greedy, and random algorithms,
respectively. Further, the vertical axis shows the average SBM,
and the horizontal axis denotes the time steps. As is obvious
from this figure, the Q-learning algorithm’s result converges
to the highest average value of SBM (19.98) among all the
presented results. The second highest average SBM is for the
greedy approach (18.01), and the last one is related to the
random approach (17.01). The performance is calculated via
a proportion equation. Considering 20 IoT devices, in the best
situation the maximum average of SBM is 20, which can be
considered as 100 percent performance; since average SBM =
20 means all of the emergency IoT devices have received help
or are in the normal situation.
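Reading the text, this proportion is

performance = (average SBM / nd) × 100 percent, e.g., (19.98 / 20) × 100 ≈ 99.9 percent for the Q-learning algorithm.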
Moreover, the total number of time steps of the Q-learning from the start of the learning process to the convergence
(with accuracy = 10^(−5)) is 1614 s, while those of the greedy and
random algorithms are 1002 s and 1306 s, respectively. The
Q-learning algorithm needs to have a learning procedure; therefore, it obviously needs a longer time for converging. In most
cases, the total number of time steps of the random algorithm
is lower than that of Q-learning, since the fog node selects a
device as a helper randomly without any process or procedure.
Therefore, this strategy makes the random approach faster than
the Q-learning algorithm. Sometimes, the random algorithm is
very unlucky and has to perform the selection process several
times to find a device with all of the required helping conditions. This makes the random algorithm slower than the greedy
algorithm, and sometimes even slower than the Q-learning algorithm.
To analyze the results of these algorithms in detail, we
focused on the beginning period of their operation. Figure 2
shows the first 100 s of Fig. 1.
The highest achievable SBM value is 20 (because nd = 20).
The initial average SBM values of all three algorithms are 0.
Immediately, the greedy algorithm reaches the highest average SBM value, while the others go toward 20 gradually, and
their curves are nearly straight lines toward 20 because the number
of emergency events is low and the number of devices with
helping conditions is high. As time goes on and the number
of emergency events increases, the number of feasible helpers
decreases, and this decreases the average SBM values of all
three algorithms. As the learning process is in the beginning
stages, the fog node does not have sufficient expertise; thus, its
operation is not perfect. Therefore, the temporary descent of
the Q-learning’s average SBM is more than that of the others’.
Over time, the learning progresses, and the fog node strengthens its decisions on the basis of its gained experiences; therefore, its average SBM increases and finally exceeds that of the
two other algorithms.
After a while, because of the convergence of the fog node’s
learning process, the average SBM becomes stable. Once the
learning process is complete, the fog node can select the best
device among the helping candidates on the basis of its experiences and the received rewards. The average SBM does not
converge exactly to the highest value, but approximates it.
There are two reasons for this fact: 1) As the fog node's environment is variable and not static, unpredicted situations may occur; therefore, the fog node continues selecting random actions with a low probability even after convergence. Moreover, as the fog node visits new states, it receives low rewards at the beginning of its learning period. 2) At times, all the devices are in their normal situations; therefore, the fog node does not need to select a helper, but it receives the highest reward value, as this ideal situation is related to the past optimal operation of the fog node.

Figure 1. Average SBM of the fog node based on the proposed algorithm (Q-learning) and the supervised (greedy) and random algorithms.
Figure 2. Average SBM of the fog node in the first 100 s based on the Q-learning, supervised (greedy), and random algorithms.
Figure 3. Average SBM of the fog node in the first 300 s based on the Q-learning, greedy, and random algorithms for 80 IoT devices.
Figure 4. Average SBM of the fog node based on the Q-learning, greedy, and random algorithms for 80 IoT devices, with eight more devices at the highest priority than in Fig. 3.
Figure 5. Average SBM of the fog node based on the Q-learning, greedy, and random algorithms for 80 IoT devices, with a priority increase for some devices of Fig. 4.
In the following, we examine all three algorithms with different initializations. We increased the number of IoT devices by 60, so the smart city considered in Fig. 3 includes 80 cameras. The results show that when the number of IoT devices increases, the fog node operation becomes worse in the cases of the greedy and random algorithms, as managing higher numbers of devices is more difficult. We continued evaluating the proposed algorithm by changing the devices' priorities besides increasing their numbers. We changed the priorities of eight devices considered in Fig. 3 to the highest and plotted the corresponding results in Fig. 4. Then we changed the devices' priorities considered in this figure to higher levels to make the deciding conditions more difficult for the fog node. The corresponding results are shown in Fig. 5. Obviously, the application of these changes makes bandwidth management more difficult. The percentages of the average SBM of all the above figures for the Q-learning, greedy, and random methods are shown in Table 1.
With the emergence of big data and IoT, the use of manual and supervisory instructions has become difficult, costly, and time consuming. Moreover, the amount of data increases over time, and situations change continuously. These factors increase the error probability. In addition, the environment changes over time, which results in a permanent need to change predefined instructions. The reinforcement learning approach solves these problems without using any supervisor, while achieving better performance by reducing human errors.
In this study, we solve the bandwidth limitation problem of a smart city, as a smart environment, in emergency situations using the Q-learning algorithm, a popular RL technique. We then compare the Q-learning algorithm with the greedy and random algorithms and show the superiority of the proposed approach from the point of view of performance. The proposed algorithm gained over 99.8 percent performance after finishing the learning period with different initializations. The fog node achieves this high performance without any supervisor, and all the operations are executed according to its own decisions based on its gained experiences.

In the future, this problem will be investigated further by adding several fog nodes to handle all the smart city zones in the same scenario, where all of the fog nodes cooperate fully and the decisions of each one affect those of the others.
Acknowledgment

This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2019-2016-0-00465) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation).
References

[1] K. C. Okafor et al., "Leveraging Fog Computing for Scalable IoT Datacenter Using SpineLeaf Network Topology," J. Electrical and Computer Engineering.
[2] Q. Wang et al., "Multimedia IoT Systems and Applications," 2017 Global Internet of Things Summit, June 2017. DOI: 10.1109/GIOTS.2017.8016221.
[3] P. K. Choubey et al., "Power Efficient, Bandwidth Optimized and Fault Tolerant Sensor Management for IoT in Smart Home," 2015 IEEE Int'l. Advanced Computing Conf., June 2015, pp. 366–70.
[4] J. Oueis, E. C. Strinati, and S. Barbarossa, "The Fog Balancing: Load Distribution for Small Cell Cloud Computing," 2015 IEEE VTC-Spring, May 2015.
[5] Q. Zhu et al., "Task Offloading Decision in Fog Computing System," China Commun., vol. 14, no. 11, Nov. 2017, pp. 59–68.
[6] D. Happ and A. Wolisz, "Towards Gateway to Cloud Offloading in IoT Publish/Subscribe Systems," 2017 2nd Int'l. Conf. Fog and Mobile Edge Computing, May 2017, pp. 101–06.
[7] T. Mitchell, Machine Learning, McGraw-Hill Science/Engineering/Math, 1997.
[8] S. S. I. Samuel, "A Review of Connectivity Challenges in IoT Smart Home," 2016 3rd MEC Int'l. Conf. Big Data and Smart City, Mar. 2016, pp. 1–4.
[9] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, MIT Press, 1998.
[10] F. S. Melo, "Convergence of Q-learning: A Simple Proof," Inst. Systems and Robotics, Tech. Rep., 2001, pp. 1–4.
[11] L. P. Kaelbling, M. L. Littman, and A. W. Moore, "Reinforcement Learning: A Survey," J. Artif. Intell. Research, vol. 4, 1996, pp. 237–85.
Motahareh Mobasheri ([email protected]) received her
B.E. degree in information technology engineering from Semnan University, Iran, in 2013 and her M.S. degree in computer
networks from Amirkabir University of Technology (Tehran
Polytechnic), Iran, in 2015. Currently, she is a Ph.D. candidate
in the Information and Communication Engineering Department of Dongguk University, Seoul, Republic of Korea.
Yangwoo Kim ([email protected]) received his Ph.D.
degree from Syracuse University, New York, in 1992. He is
the corresponding author for this article and a professor at
Dongguk University. His research interests include parallel and
distributed processing systems, cloud computing, grid computing, and edge computing.
Woongsup Kim ([email protected]) received his B.E.
degree in computer engineering from Seoul National University in 1998, his M.S. degree in computer and information science from the University of Pennsylvania in 2001, and
his Ph.D. degree in computer science from Michigan State University, East Lansing. Since 2007, he has been a faculty member of the
Department of Information and Communication Engineering
at Dongguk University. His research interests include software
engineering, web semantics, service oriented computing, and
IoT system design.
Table 1. Final results of the 1: Q-learning, 2: greedy, and 3: random approaches.

| Figure | Average SBM | Total time steps (s) | Performance (%) |