AI_deation: A Creative Knowledge Mining Method for Design Exploration

Ideation is a core activity in the design process which begins with a design brief and results in a range of design concepts. However, due to its exploratory nature it is challenging to formalise computationally. Here, we report a creative knowledge mining method that combines design theory with a machine learning approach. This study begins by introducing a graphic design style classification model that acts as a model for the aesthetic evaluation of images. A Grad-CAM technique is used to visualise where our model is looking in order to detect and interpret visual syntax, such as geometric influences and color gradients, and to determine the most influential visual semiotics. Our comparative analysis of two Nordic design referents suggests that our approach can be efficiently used to support and motivate design exploration. Based on these findings, we discuss the prospects of machine vision aided design systems to envisage concepts and possible design paths, but also to support educational objectives.


Introduction
Graphic design is the process and art of combining visual elements in order to convey a message or an idea through a specific medium. In order to effectively communicate this message, the graphic designer must have a strong understanding of the various design principles, which include balance, contrast, emphasis, movement, pattern, unity, and variety. Additionally, the designer must be aware of the different design elements, such as line, shape, color, texture, and space. Graphic design shares many concerns with other design disciplines, such as architecture and industrial design, essentially being a problem-solving procedure with a focus on the use of visual communication elements and techniques [1] [2].
The graphic design workflow involves a detailed design brief, the context of the work, the message to be communicated and a target audience. The design process starts with a problem definition, followed by an ideation phase, which is a critical phase of any design practice. Ideation is the designer's "safe space" where novel or even unconventional ideas can be nurtured. The purpose of the ideation phase is to transform the perception of a third party's requirements (input) into design decisions (output). One activity of the ideation phase is to search for references that could act as a source of inspiration and guidance [3]. The outcome of this search is a clear understanding of the problem and a set of potential approaches to a solution, rather than a final word on best practices [1]. It is estimated that one third of the total ideation time is devoted to research in order to retrieve and interpret reference material [3]. Designers dedicate time to increasing their visual repertoire by seeking, collecting and using visual material. It has been reported that having access to an extensive set of visual references could improve design solutions in both quality and novelty [4].
Thus, a visual repertoire is important, but so is visual media literacy, which is needed to extract meaningful interpretations in a variety of contexts. By visual repertoire we mean an extensive visual catalogue to refer to, mentally, digitally or physically [3]. By visual literacy we mean that designers have to be familiar with design principles drawn from different bodies of study, such as history of art, gestalt psychology and theory of aesthetics, semiotic analysis and critical or cultural studies, among others [2].
* George Palamas. Email: gpa@create.aau.dk

EAI Endorsed Transactions on Creative Technologies, 07 2022 - 11 2022 | Volume 9 | Issue 3 | e5
AI and machine learning have recently been demonstrated to be capable of solving time-consuming tasks in graphic design. For example, machine learning can automatically generate layouts for documents or websites. Furthermore, machine learning can be used to create 3D models or illustrations from textual instructions.
It should be noted, however, that machine learning cannot replace human creativity in graphic design. Despite machine learning's ability to generate ideas, it is ultimately up to the human designer to decide which ideas to use. Hence, there is a need for a computational tool that can assist designers with design theory, such as classification of a style or genre and detection of design style characteristics.
However, all these applications can be used only after an initial idea has been established. Therefore, we devise an ideation tool based on machine learning (AI_deation) that could speed up the search for indirect visual references, in an attempt to increase the designers' visual repertoire, and that could also support the designer in exploring and interpreting this material.
The goal of this study is to use machine learning to aid in the design process by providing a means of automatically detecting and interpreting visual syntax in images. We believe that this could be used to support and motivate design exploration, as well as to support educational objectives. By using this information, designers can make informed choices about which features to use or avoid.
First, we describe a classifier for Nordic design styles based on a pre-trained convolutional neural network (CNN), which can be used to detect the design style of a poster automatically. Second, we describe a data visualization scheme that shows the designer the important image areas that can be interpreted as visual language for the purpose of designing a poster. Grad-CAM is used to visualize what our model is observing. The visualization supports an evaluation and interpretation of the visual syntax of a particular design style. Such an ideation tool can be used by designers to gain familiarity with particular design styles and with the features within a design style.
We developed this methodology to provide designers with clear direction when approaching design briefs during the ideation phase. A designer will then be able to identify styles and get examples related to them. This allows designers to make better use of their ideation time, which may in turn enhance quality and novelty.
This paper is organized as follows. Typical graphic design methodologies, Nordic design specificities, and machine learning algorithms are discussed in Section 2. In Section 3, we analyse the classification model and the visualization tool used to inspect lower-level visual features. Section 4 presents the visual inspection results along with an interpretation. Section 5 discusses possible improvements.

Design styles: The Nordic Style case
One way to visually group different design principles is by styles and movements. At a lower level, a design style is a group of visual features such as particular color harmonies, geometric influences and compositional practices, while at a higher level it can be defined as a group of principles and techniques for producing a specific visual outcome. Design style is a multi-dimensional construct which reflects a variety of influences: cultural and societal dynamics, environmental conditions, trending ideas and religion, all of them modulated by the dominant art forms of a particular time and place. Different movements throughout history are associated with a number of well recognizable styles such as Art Deco, Bauhaus or De Stijl [2].
Nordic design includes many different ideas and aesthetics that cannot be restricted to a single style, but that share a general simplicity with a focus on functionality, a combination that creates comforting environments appealing to fast-paced lifestyles. Nordic design is commonly described in terms of visual appearance, using concepts such as minimalism, characterized by restrained colour palettes, simple forms and strict typography. But this is a rather simplified understanding. It is argued in [5] that Nordic design should instead be described as 'focused', since it is characterized by exactness and accuracy of expression. It eliminates the unnecessary and superfluous, putting the essential at its core. This thought process serves clarity and functionality, and its distinctive visual expression is a consequence [5].

Visual Language in Nordic Design School
It can be inferred from the description of Nordic Style that classification into a style can depend on a set of rules, which in turn affects the visual outcome and its aesthetics. Each particular style has its own rules and places visual aesthetics in different hierarchies. But this is a rather rigid way of looking at the design process. The Danish agency "Homework" prides itself on being able to create a large visual repertoire when starting work on a new project. This repertoire is composed of a combination of modern inspiration, research, their understanding of design theory and retro references [5]. This is arguably one of the key components of graphic design practice: having both an understanding of design theory and a vast visual repertoire. According to "Stockholm Design Lab", a solid foundation and principles act as guidance that we can always refer to while creating [5]. With an overview of many options and tools, the designer will be better prepared than when starting from scratch. A better understanding of styles, and therefore of the visual codes and basic set of rules within each style, could help the designer to be better prepared when facing a design problem. It has to be considered, though, that inspiration cannot be attributed only to specific design styles, but they can provide a starting point and save some research time. However, it is sometimes hard to tell where the inspiration comes from at the moment of creating a specific product, due to the overwhelming number of images we are exposed to. We therefore cannot be sure how strongly what we see influences the thought process, or how influential each individual stimulus is, since we are cognitively limited in storing information that is presented in overwhelming amounts [6].
The present study considers that using two Nordic exponents presents a challenge in terms of having fewer features to identify, given the 'focused' character of the Nordic style. It also presents the opportunity to recognize the precise features that make each exponent unique within the same style.

Classification of Image Styles
Image classification is a fundamental problem in computer vision and serves as a foundation for extracting meaningful information from a raw image. However, the majority of these systems are concerned with the identification and labeling of the content of an image. Classification of images that are largely motivated by themes and styles is a more challenging task because it relies on abstractions which impose subjective interpretations. In their study, Obeso et al. [7] propose a CNN model to classify images of Mexican architectural styles into three categories: pre-hispanic, colonial and modern. A fourth class was introduced in order to classify images containing non-architectural content, which would be discarded later. They also argue that human errors are possible in the annotation process, especially if the data was extracted from videos. Their model was able to classify various images of buildings into these three classes with an accuracy of 88.01%. According to them, the size and quality of the data set affected the robustness of their model. The study in [8] examined the positive impact of using transfer learning when the target training data set is small and demonstrated the robustness of the approach in a facial emotion recognition application. The authors of [9] proposed a movie genre classification system based only on images of movie posters. The system could potentially be used to support the ideation phase of graphic design. As an example, the system can suggest poster design examples by finding similar posters belonging to similar movie genres. The authors of [10] presented a deep learning model for the aesthetic evaluation of images. Unlike other solutions focusing solely on handcrafted image features, the effectiveness of their work was not hindered by the need for experts with a deep understanding of the aesthetic criteria.
LayoutGAN [11] is a Generative Adversarial Network that synthesizes layouts by modeling geometric relationships between different types of 2D elements. LayoutGAN uses self-attention modules to refine the labels and geometric parameters of randomly placed 2D graphic elements to produce a realistic layout.

Convolutional Neural Networks (CNNs)
CNNs have been used for image recognition since the 1980s [12,13] and are now applied to complex computer vision tasks such as image classification [14,15] and image captioning [16], among others. Their superior performance on complex visual tasks has made them the state-of-the-art method for object detection and image classification [17,18]. CNNs typically require a large amount of training data. However, the creation of a visual data-set is a complex task which requires accurate labeling and a thorough selection of images [12,14]. Due to the availability of large datasets such as ImageNet, it is possible to train CNNs without having to create a data-set from scratch [14,19]. For example, ImageNet has over 15 million images, collected from the web and labeled by humans in approximately 22,000 categories [14]. Different studies have reported the importance of a large data-set with properly labeled data [7,12]. In response to this, studies have shown that transfer learning methods can be used to counteract the lack of data [20,21]. Transfer learning allows us to start with the learned features from a source domain, and adjust these features and perhaps the structure of the model to suit a new target domain [22].

Visualizing Design Style Language
The aim of this visualization is to elucidate which features are the most relevant and therefore allow us to gain insights into the features that make each class representative. The images under investigation belong to two Nordic design styles and are compositions of illustrated elements. These are abstractions of real-life objects, human characters, animals, buildings and typography, among others. Abstraction is a commonly used practice in graphic design. It is the process of simplifying the original shape in order to preserve the representative features, either to enhance the recognition of the object or to enhance the impression on the observer [23,24]. Abstraction was widely used by Art Deco artists [25] (see Figure 1).
Class Activation Maps (CAM) have been used for visualizing complexity in CNNs. A CAM visualizes the image regions that carried the most weight in assigning an image to a given class. CAM has also proven effective for other uses, such as concept discovery. The CAM procedure requires the use of CNNs without Fully Connected (FC) layers, because the convolutional layers have been shown to act as object detectors whose localization information is lost in the FC layers [26]. The authors of [26] also suggest dropping the FC layers in order to reduce the number of parameters (by 90% according to their trials with VGGnet) and instead applying global average pooling (GAP) to the convolutional feature map, followed by a fully-connected softmax layer. GAP acts as a regularizer, helping to prevent overfitting, and allows the network to retain the localization of deep features until its final layer. Other studies suggest the use of global max pooling (GMP) instead of GAP [27], but this procedure retains only the maximum values and therefore ignores the lower-valued pixels that make up the rest of the object to be detected. CAMs identify the discriminative areas by projecting the weights of the output layer back onto the convolutional feature map. The result is displayed as a heatmap in which the most important areas of the image usually appear red to yellow, depending on the colormap used. In essence, a CAM is a linear weighted sum of the feature maps, evaluated at every spatial location, for a specific class. As mentioned previously, CAM requires changing the architecture of the model, which some authors reported changed the accuracy in their trials by 1-2% [26]. This compromises accuracy and constrains CAM to certain architectures that, for example, do not support tasks such as visual question answering (VQA) and image captioning. Grad-CAM, by contrast, has been described as a visualization approach that allows the use of off-the-shelf model architectures [28].
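The weighted-sum view of CAM can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the function names, the 2x2 feature maps and the class weights are hypothetical toy values, and the bias term of the output layer is omitted. The sketch also shows why GAP preserves localization: the pre-softmax class score equals the spatial mean of the CAM.

```python
from typing import List

Matrix = List[List[float]]


def global_average_pool(feature_map: Matrix) -> float:
    """Average one feature map over all spatial locations (GAP)."""
    values = [v for row in feature_map for v in row]
    return sum(values) / len(values)


def class_activation_map(feature_maps: List[Matrix],
                         class_weights: List[float]) -> Matrix:
    """CAM for one class: weighted sum of the feature maps at each location."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for y in range(height):
            for x in range(width):
                cam[y][x] += weight * fmap[y][x]
    return cam


# Hypothetical toy input: two 2x2 feature maps from the last conv layer
# and the output-layer weights that connect them to one class.
feature_maps = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
class_weights = [0.5, -1.0]

cam = class_activation_map(feature_maps, class_weights)

# Class score as computed by GAP followed by the output layer (no bias).
class_score = sum(w * global_average_pool(f)
                  for f, w in zip(feature_maps, class_weights))

# The same score, recovered as the spatial mean of the CAM.
cam_mean = global_average_pool(cam)
```

In this toy case the first feature map pushes the top-left location towards the class, while the second pushes the bottom-right location away from it, which is exactly what the heatmap would display.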
Grad-CAM uses the gradient of the score for a specific class with respect to the feature maps of the last convolutional layer. These gradients indicate how strongly the class score responds to changes in each feature map, and they are used to assign an importance value to each neuron for the particular decision being made. The authors of [28] also argue that Grad-CAM is class-discriminative and able to localize relevant image regions, but lacks the capability of showing fine details. They propose that by combining Grad-CAM with guided back-propagation it is possible to get the best of both worlds. Grad-CAM has been used to spot bias in datasets, such as a dataset presenting nurses as females and doctors as males, which biases the model towards gender stereotypes [28].
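The Grad-CAM weighting step can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the code used in this study: in a real framework the gradients of the class score with respect to the feature maps come from automatic differentiation, whereas here they are passed in as hypothetical toy values, and the function name `grad_cam` is our own.

```python
from typing import List

Matrix = List[List[float]]


def grad_cam(feature_maps: List[Matrix], gradients: List[Matrix]) -> Matrix:
    """Grad-CAM heatmap from last-conv-layer feature maps and the gradients
    of one class score with respect to those feature maps."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    heatmap = [[0.0] * width for _ in range(height)]
    for fmap, grad in zip(feature_maps, gradients):
        # Channel importance: the gradient averaged over all locations.
        alpha = sum(v for row in grad for v in row) / (height * width)
        for y in range(height):
            for x in range(width):
                heatmap[y][x] += alpha * fmap[y][x]
    # ReLU keeps only the regions with a positive influence on the class.
    return [[max(0.0, v) for v in row] for row in heatmap]


# Hypothetical toy input: two 2x2 feature maps and their gradients.
feature_maps = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
gradients = [
    [[1.0, 1.0], [1.0, 1.0]],    # averages to  1.0: supports the class
    [[-2.0, 0.0], [0.0, 0.0]],   # averages to -0.5: opposes the class
]

heatmap = grad_cam(feature_maps, gradients)
```

Because the weights are derived from gradients rather than from a dedicated output layer, no change to the network architecture is needed, which is why the approach works with off-the-shelf models.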

A recommender System as an ideation tool
A concept is developed through an exploration activity that starts with a design brief and results in a range of divergent design concepts. It is, however, difficult to formalize computationally because of its exploratory nature.
In [29], an ideation model combining two computational approaches was proposed to bring inspiration. First, an ideation network retrieves cross-domain associations, which are then visualized in a semantic graph. Second, a generative adversarial network (GAN) is used to learn images that present two distinct concepts and then output a new synthesized image. The authors of [30] proposed an interactive ideation support system based on cooperative contextual bandits (CCB). Based on an exploration phase, this approach can suggest inspirational material using machine learning. Furthermore, it can explain its suggestions to aid reflection. All relevant inspirational materials were curated into collages using digital mood boards.
In light of the above, we conclude that a recommender system based on graphic design style classification would be able to provide designers with suggestions for new design concepts, as well as possible design paths to explore. Additionally, this approach could be used to support educational objectives, such as helping students to develop an understanding of design styles. By familiarizing themselves with the style recognized and proposed by the model and identifying its influences according to its style-specific characteristics, it may be possible to increase the designers' visual repertoire and aid the research process.
However, this tool should aid the research process and would not in any case change the role of designers and art directors, since designers still need to choose which elements of the provided references to use and to execute the new design themselves.

AI_deation: Design and Implementation
There are two main parts to this section. In the first part, the dataset generation process and the development of a pre-trained design style classifier are described. The second part describes the Grad-CAM visualization approach used to highlight the most significant style features.

Two Nordic Designers: Marimekko & Mads Berg
Marimekko specializes in 2D pattern design printed onto different substrates for a wide variety of products such as home decor and clothing. As a result, this class's dataset contains several images depicting real-world objects, which adds a third dimension to these patterns. There are two reasons that led us to use this design school. From a designer's perspective, Marimekko is very popular in Finnish culture and has been present since 1951 [31]. It offers contrasts of colorful versus monochromatic designs and abstract versus figurative ones, yet all aim to bring nature and the rural closer to the users, in a bold fashion, rethinking and redesigning Finnish nature in their products [32]. They manage to be true to the Nordic design style by carefully selecting and focusing the elements that form part of their patterns, as well as the context of use of their products [32]. From a technical point of view, we wanted to challenge our model with the increased complexity of real-world settings. Our second exponent is Mads Berg, known for his vintage illustrations and modern art-deco style. Posters are represented in this dataset as 2D images. Among Mads Berg's sources of inspiration are classic paintings, avant-garde design movements and new art currents. Like the designers previously mentioned, he describes Nordic design as carefully choosing a few elements in a composition without ending up with boring or vague results. It is about being bold and sometimes using humor or surrealism [33].

Part 1: Nordic design classification model
Dataset Generation. For good model performance, it is very important to have a good dataset with a large number of high-quality images. A web scraping technique was used to harvest images from the web since there were no datasets for these two designers.
Two labeled classes were formed (Figure 2), representing Mads Berg and Marimekko respectively, with the following dataset composition: 300 training images, 100 validation images and 100 test images per class.
Data augmentation. Small datasets are insufficient to form strong feature relationships within deep CNN models. A popular method to tackle this problem and improve training accuracy is the use of data augmentation techniques [34]. Image augmentation can be achieved by applying a variety of transformations to a limited dataset (Figure 3). Raw images are typically transformed with affine and elastic transformations, such as zooming, rotating, translating, flipping, stretching or warping. As a result, the dataset becomes significantly larger due to the added variation.

VGG-16 Model Architecture. VGG-16 is a Deep Convolutional Neural Network (DCNN) model, trained on RGB images of size 224x224 [35]. The model uses 3x3 filter kernels for the convolutional layers, and five max-pooling layers with a 2x2 window. Two fully connected layers are used in the model along with a softmax output activation function (Figure 4). The model name (VGG) refers to the Visual Geometry Group, and 16 is the number of weight layers. There are approximately 138 million parameters in the network [35]. LeCun et al. demonstrated the model's generalization capacity on a range of tasks with a variety of datasets, matching or outperforming a more complex recognition pipeline built around a shallower architecture [13].
Transfer learning and fine tuning. Yosinski et al. investigated the possibility of transferring visual features in deep neural networks. In their study, they found that the lower layers operate as conventional computer vision feature extractors, such as edge detectors, while the final layers operate on task-specific features [36]. Han et al. suggest that the features learned by the first layers are more general, which makes them applicable to a variety of datasets and tasks. Features then progress from general to specific towards the model's last layers [34]. Figure 5 illustrates the model architecture and methodology, with VGG16 as the backbone model of our system. In order to utilize the power of the pre-trained model, the convolutional layers were frozen and the last two layers were unfrozen so that their weights could be updated. To ensure that the model's updates do not undermine the pre-trained features, the model was trained using low learning rates, as recommended for fine-tuning [34]. Furthermore, dropout was introduced to the model for regularization, masking 3% of the dense-layer units. Because the classifier is binary, a sigmoid activation function was used instead of a softmax activation.
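The VGG-16 figures quoted above (3x3 kernels, five 2x2 max-pooling layers, 224x224 RGB input, roughly 138 million parameters) can be checked with a short calculation. The sketch below assumes the standard ImageNet configuration of VGG-16, which the text does not spell out: thirteen convolutional layers arranged in five blocks, followed by dense layers of 4096, 4096 and 1000 units. It describes the original network, not the modified classifier head used in this study.

```python
def conv_params(in_channels: int, out_channels: int, kernel: int = 3) -> int:
    """Weights plus biases of one convolutional layer."""
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(in_units: int, out_units: int) -> int:
    """Weights plus biases of one fully connected layer."""
    return in_units * out_units + out_units


# (number of conv layers, output channels) for each of the five blocks.
# Assumption: the standard VGG-16 configuration.
CONV_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]


def vgg16_parameter_counts(input_size: int = 224, input_channels: int = 3,
                           num_classes: int = 1000):
    """Return (convolutional parameters, dense parameters) for VGG-16."""
    conv_total = 0
    channels, size = input_channels, input_size
    for n_layers, out_channels in CONV_BLOCKS:
        for _ in range(n_layers):
            conv_total += conv_params(channels, out_channels)
            channels = out_channels
        size //= 2  # each block ends with a 2x2 max-pooling layer
    flattened = size * size * channels  # 7 * 7 * 512 for a 224x224 input
    dense_total = (dense_params(flattened, 4096)
                   + dense_params(4096, 4096)
                   + dense_params(4096, num_classes))
    return conv_total, dense_total


conv_total, dense_total = vgg16_parameter_counts()
total = conv_total + dense_total
print(f"convolutional: {conv_total:,}  dense: {dense_total:,}  total: {total:,}")
```

Under these assumptions the convolutional base accounts for only about a tenth of all parameters. The bulk sits in the dense layers, which is one reason a pre-trained convolutional base can be reused while a small, task-specific head is trained on a few hundred images.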

Improving the Model Performance
This section suggests a number of improvements to address overfitting and enhance overall performance. These efforts focused primarily on improving the quality of the dataset. Two conditions were considered. The first was to increase the volume of the dataset using augmentation techniques [37], which is suitable for counteracting low to medium levels of overfitting. The second was to increase the quality of the dataset, which can contribute to better model accuracy [20], by carefully selecting the most representative items and rejecting the unrelated ones [7]. The improved dataset had the same size and composition as the original dataset.
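To make the augmentation idea concrete, the following Python sketch applies three of the simpler transformations mentioned earlier (flipping, rotating and translating) to a toy image stored as a nested list. It is an illustration only: the function names and the 2x3 image are hypothetical, and a real pipeline would rely on an image library and would also cover zooming, stretching and elastic warping.

```python
from typing import List

Image = List[List[int]]


def flip_horizontal(image: Image) -> Image:
    """Mirror the image left to right."""
    return [list(reversed(row)) for row in image]


def rotate_90_clockwise(image: Image) -> Image:
    """Rotate the image by 90 degrees clockwise."""
    return [list(column) for column in zip(*image[::-1])]


def translate_right(image: Image, shift: int, fill: int = 0) -> Image:
    """Shift the image to the right, padding the left edge with `fill`."""
    width = len(image[0])
    shift = min(shift, width)
    return [[fill] * shift + row[:width - shift] for row in image]


def augment(image: Image) -> List[Image]:
    """Return the original image together with three transformed copies."""
    return [
        image,
        flip_horizontal(image),
        rotate_90_clockwise(image),
        translate_right(image, 1),
    ]


# Hypothetical 2x3 grayscale image.
image = [
    [1, 2, 3],
    [4, 5, 6],
]
augmented = augment(image)
```

Each source image yields several training samples, so the dataset grows without new material being collected. The added samples remain correlated with the originals, which is consistent with augmentation counteracting only low to medium levels of overfitting.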

Visualizing Design Style Features
Implementing Grad-CAM for transfer learning models is challenging because of the way sequential models are implemented. The VGG16 model was used as a global image feature detector, along with two additional dense (fully connected) layers. When implementing Grad-CAM, this approach presents a disadvantage, since Grad-CAM requires access to the convolutional layers, and these layers of the VGG16 cannot be accessed directly. The VGG16 model was therefore treated as an inner model, and the convolutional/pooling layer required by Grad-CAM was retrieved directly from the inner model.

Results
Using both the original dataset and the improved dataset, we checked for overfitting, instability, and overall performance after training the model. According to Figure 7, 95% accuracy was observed on both the training and validation datasets, which may indicate overfitting. Furthermore, the training loss graph (Figure 7) shows a rapid decrease in loss, whereas the validation loss fluctuates slightly and tends to increase. Overfitting is more apparent after the 25th epoch.
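The pattern described here, a validation loss that stops improving while the training loss keeps falling, can also be detected programmatically. The sketch below shows one common heuristic (early stopping with a patience window). It is not the procedure used in this study, and the loss values are hypothetical toy numbers rather than the values plotted in Figure 7.

```python
from typing import List, Optional


def best_epoch(val_losses: List[float]) -> int:
    """1-indexed epoch with the lowest validation loss."""
    index = min(range(len(val_losses)), key=lambda i: val_losses[i])
    return index + 1


def early_stopping_epoch(val_losses: List[float],
                         patience: int = 3) -> Optional[int]:
    """First epoch at which the validation loss has failed to improve for
    `patience` consecutive epochs, or None if that never happens."""
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            waited = 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return None


# Hypothetical validation losses: improving at first, then drifting upwards.
val_losses = [0.90, 0.60, 0.45, 0.40, 0.42, 0.41, 0.47, 0.52]

print(best_epoch(val_losses))            # epoch with the lowest loss
print(early_stopping_epoch(val_losses))  # epoch at which training would stop
```

With the toy values above, the lowest validation loss occurs at epoch 4 and training would stop at epoch 7, after three epochs without improvement.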

Discussion
Comparing the error and accuracy graphs for the initial and conditioned datasets, it is evident that the latter shows better overall behavior and stability (see Figure 8). The prediction accuracy was tested with 100 randomly selected images. The model achieved a testing accuracy of 77%. Improving the quality of the dataset yielded the best performance, resulting in increased accuracy and better generalization, without overfitting. These results underline the importance of a good quality dataset, and also stress the importance of the human, as a content mediator, in improving a small dataset.
Both models reached a high accuracy of 95%, outperforming the results achieved by Obeso et al. [7]. This supports the hypothesis that transfer learning is indeed a good option for a very specific domain task such as classifying different graphic design styles. Marimekko was assigned to more images (both true positives and false positives), which might have been affected by the detection of shapes such as flowers in the dataset, as ImageNet includes a sub-tree for flowers [19]. The "Unikko" design, Marimekko's famous poppy pattern, was present in the dataset in different variations and colors, so the dataset included a recurring pattern of flowers that could possibly induce a bias. In the output images generated by Grad-CAM, it is evident that the model incorrectly classified "Mads Berg" images that contain flowers as "Marimekko". The heat map highlighted the flowers, supporting the hypothesis that the model is biased towards them. The bias could be mitigated by balancing the number of designs with flowers in both classes, as previously applied in [28] to help correct a gender-biased dataset.
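The effect of such a bias on the error counts can be made explicit with a small Python sketch. The labels below are hypothetical toy values, not the predictions obtained in this study, and the helper names are our own. They mimic a classifier that over-assigns images to the Marimekko class.

```python
from typing import Dict, List


def confusion_counts(y_true: List[str], y_pred: List[str],
                     positive: str) -> Dict[str, int]:
    """Count true/false positives and negatives for one class."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for truth, prediction in zip(y_true, y_pred):
        if prediction == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts


def accuracy(c: Dict[str, int]) -> float:
    return (c["tp"] + c["tn"]) / sum(c.values())


def precision(c: Dict[str, int]) -> float:
    predicted_positive = c["tp"] + c["fp"]
    return c["tp"] / predicted_positive if predicted_positive else 0.0


def recall(c: Dict[str, int]) -> float:
    actual_positive = c["tp"] + c["fn"]
    return c["tp"] / actual_positive if actual_positive else 0.0


# Hypothetical labels: one Mads Berg poster (say, one containing flowers)
# is wrongly assigned to Marimekko.
y_true = ["marimekko"] * 3 + ["mads_berg"] * 3
y_pred = ["marimekko", "marimekko", "marimekko",
          "marimekko", "mads_berg", "mads_berg"]

counts = confusion_counts(y_true, y_pred, positive="marimekko")
print(counts, accuracy(counts), precision(counts), recall(counts))
```

In this toy case every Marimekko image is found (recall of 1.0), but one in four images labelled Marimekko is wrong (precision of 0.75). Reporting precision and recall per class alongside overall accuracy would make a bias of this kind visible without inspecting the heat maps.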

Design Considerations
From a graphic design perspective, it makes sense that the model is learning to identify visual elements from nature, since Marimekko's patterns are inspired by Finnish nature. As can be seen in Figure 10, the heat-map overlay focuses on flowers and on the round eye-glasses, an indication that the model has learned to treat circular shapes as Marimekko features. The third Marimekko image was wrongly classified as Mads Berg (see Figure 10), probably because of the 'edgy' composition and the various color gradients caused by environmental lighting.
The heat maps associated with Mads Berg show a focus on sharp shapes and angles (see Figure 9). In contrast, the Marimekko heat maps seem to favor clusters of objects, curvatures, and circular shapes such as flower abstractions, as well as smooth color gradients.
Regarding the visual inspection of Mads Berg's work, it can be observed that the model was able to detect low-level visual syntax such as sharp shapes and highly geometrical abstractions of human figures. From a graphic design perspective, this has theoretical backing, as Mads Berg's inspiration is the Art Deco style, which relies heavily on graphic elements by geometrizing its features [38]. Art Deco is characterized by the use of simplified shapes, bold silhouettes, bright colors, angular and geometric forms, and color gradients giving the illusion of depth. All of these elements are present in Mads Berg's works. Using previously unseen Art Deco posters from the 1920s and 1930s, we evaluated the performance of both the classifier and the heatmap-based visualization. The results can be seen in Figure 6. The main difference between the training corpus and the test image is the font used in the title. Nevertheless, the model correctly detected the shapes, colors, and overall geometry of the design. The model ignored the text, which, unlike the dominant features of this design style, is not a common visual grammar feature across all Art Deco posters, i.e. the text usually appears organically instead of geometrically.
Grad-CAM proved to be a useful tool for detecting general features specific to each design style, showcasing a promising opportunity for the development of exploratory tools for educational and design conceptualization purposes. The results show the importance of shapes, and their use by different designers, as definitive visual grammar characteristics. In a future iteration of development, the application should employ more design styles on a broader range of datasets to enable a more comprehensive analysis. Moreover, by including a variety of designers belonging to a specific style, users would be able to receive recommendations about designers or art styles that share similar features, e.g. De Stijl to Bauhaus or Modernism to Art Deco. Thus AI_deation could be used to explore genealogies of design styles or multiple influences from a single design.

Conclusions
A computational approach to early design was demonstrated in this study, empowering the designer rather than eliminating their role. AI_deation can facilitate intuition, which is important for conceptual decision-making when a large number of possible solutions are present. Furthermore, the system can support educational innovation through active exploration of design components. Students and inexperienced designers can strengthen their design literacy and understanding with minimal supervision.
This study was primarily focused on graphic design. According to the Grad-CAM visualizations, our model tends to interpret natural elements as Marimekko, while Mads Berg's use of geometric shapes is interpreted as a reference to Art Deco. This is somewhat expected, as Marimekko is a Finnish design company known for its bold use of colors and patterns and its inspiration from nature, while Art Deco is a 20th-century art movement inspired by industrialization and characterized by geometric shapes and contrasting colors. The fact that our model is interpretable and able to provide insights into how it makes its predictions is a very useful and consistent finding, as it suggests that a graphic designer can understand and benefit from